Abstract

The  Air  Force  Weather  Agency  (AFWA)  currently  uses  an  algorithm  to  calculate 
surface  temperatures  from  microwave  observations  taken  by  the  Special  Sensor 
Microwave  Imager  (SSM/I)  aboard  the  orbiting  platforms  of  the  Defense  Meteorological 
Satellite  Program  (DMSP).  This  algorithm,  called  the  Calibration-Validation  (CV) 
algorithm,  uses  multiple  linear  regression  to  calculate  coefficients  relating  microwave 
brightness  temperatures  and  land  surface  temperatures.  Because  the  coefficients  in  this 
algorithm  do  not  take  into  account  the  identity  of  the  individual  satellite,  the  question 
arose  whether  this  assumption  was  valid. 

This  thesis  used  multiple  linear  regression,  stepwise  linear  regression,  and 
qualitative  regression  on  3700  data  sets  from  October  of  1996  and  September  of  1997, 
including  microwave  brightness  temperatures  from  three  satellites.  This  data  was 
analyzed  to  determine  if  satellite  identity  had  a  significant  impact  on  CV  regression 
coefficients.  Analysis  indicated  that  satellite  identity  does  not  have  a  significant  impact 
on  regression  coefficients  for  five  of  the  eight  CV  land  types  investigated.  Analysis  of 
two  CV  land  types  indicated  data  set  identity  had  a  significant  impact,  while  there  was 
insufficient  data  to  determine  the  impact  for  one  CV  land  type. 

In  addition  to  the  qualitative  regression,  stepwise  linear  regression  was  performed 
on  five  land  type  categories  using  combined  data  for  all  satellites.  Regressed  RMSEs 
ranged  from  2.825  K  to  3.743  K,  while  R  squared  values  ranged  from  .7295  to  .8613. 
Preliminary  analysis  indicated  refinement  of  CV  brightness  temperature  coefficients 
might  yield  better  accuracy  for  the  algorithm. 
A REFINEMENT AND CROSS-VALIDATION OF
THE  SPECIAL  SENSOR  MICROWAVE  IMAGER  (SSM/I) 
CALIBRATION/VALIDATION  (CV)  BRIGHTNESS 
TEMPERATURE  ALGORITHM 


I. Introduction


Chapter  Overview 

This  chapter  introduces  satellite  passive  microwave  remote  sensing,  its  use  in 
determination  of  land  surface  temperatures,  and  an  explanation  why  improved  accuracy 
would  be  beneficial  in  scientific  and  military  applications.  A  background  section  will 
identify  the  two  space  platforms  used  to  obtain  microwave  brightness  temperature 
measurements  and  the  conversion  algorithm  refined  and  cross-validated  in  this  study. 

The  research  problem,  assumptions,  and  general  research  approach  will  be  outlined. 
Finally,  the  chapter  briefly  summarizes  the  results  of  the  research. 

Introduction 

Until  recently,  the  only  method  of  determining  surface  temperature  at  a  given 
location  was  to  place  a  thermometer  there  and  have  a  human  read  it.  Obviously,  this  is 
not  always  possible,  especially  in  sparsely  populated  or  data  denied  areas.  With  the 
increased  emphasis  on  numerical  weather  prediction,  it  has  become  important  to 
determine  temperatures  in  the  very  regions  where  such  in  situ  temperature  measurements 
are  few  and  far  between.  If  we  can  measure  terrestrial  emissions  via  satellite  and  develop 
an accurate algorithm for calculating surface temperature, it will greatly aid in initializing
computer  atmospheric  models.  In  addition,  military  commanders  will  have  better  access 
to  information  for  their  Intelligence  Preparation  of  the  Battlefield  (IPB)  assessments,  for 
example  allowing  for  more  accurate  weapons  lock-on  range  estimates  by  such 
temperature-dependent  computer  programs  as  the  Electro-optical  Tactical  Decision  Aid 
(EOTDA).  One  attempt  to  measure  temperatures  via  remote  sensing  is  the 
Calibration/Validation  (CV)  Algorithm,  adapted  to  the  Special  Sensor  Microwave  Imager 
(SSM/I)  aboard  the  F13  Defense  Meteorological  Satellite  Program  (DMSP)  Satellite.  If 
research  can  be  performed  to  improve  the  accuracy  of  the  algorithm,  as  well  as  to  cross- 
validate  the  algorithm  on  other  DMSP  satellites,  the  benefits  to  atmospheric  modelers  and 
military commanders will be that much greater.

Background 

Before  we  delve  into  the  specifics  of  the  algorithm  and  the  passive  microwave 
imager,  we  must  understand  why  the  microwave  spectrum  would  be  the  best  choice  for 
determining  surface  temperature.  An  important  criterion  is  that  terrestrial  emission 
measurements  not  be  significantly  contaminated  by  extraterrestrial  sources,  such  as  the 
sun.  Thus,  our  measurements  must  be  obtained  within  the  long  (wavelength)  end  of  the 
electromagnetic  spectrum  (Rees,  1990).  A  prima  facie  choice  in  this  range  might  be  the 
infrared  (IR)  spectrum;  however,  clouds  have  a  high  albedo  in  this  range  (Rees,  1990), 
thus  making  daytime  surface  measurements  difficult  where  there  is  cloud  cover.  The 
microwave  spectrum  is  the  best  choice  for  four  reasons:  readings  are  not  affected  by  the 
sun’s  illumination;  microwave  radiation  penetrates  clouds;  microwave  radiation 
penetrates vegetation; and the nature of microwave radiation allows it to be measured
more easily from a spaceborne platform (Ulaby, 1981).

On  each  of  the  DMSP  platforms,  there  is  an  SSM/I.  This  thesis  will  concentrate 
on three satellites carrying this imager: the F13 satellite, the F10 satellite, and the newer
F14 satellite; the latter was launched in April 1997 from Vandenberg AFB, California
(Cooper, 1997). Research has already been done on land temperature analysis with the
F13 satellite (Harris, 1998; and Comoglio, 1997), including a comparison of two different
brightness temperature algorithms: the Calibration-Validation (CV) algorithm and the
TMPSMI  (TS)  algorithm.  This  study  will  concentrate  on  improving  regression 
coefficients  of  the  CV  algorithm  because  the  CV  algorithm  has  a  higher  production  rate 
and  met  Air  Force  Weather  Agency  (AFWA)  accuracy  criteria  more  often  than  the  TS 
algorithm  (Harris,  1998). 

Problem  and  Assumptions 

The focus questions of this research are twofold: first, whether regression techniques can
fine-tune the CV algorithm’s coefficients to yield greater accuracy; and second, whether the
“true” regression coefficients relating brightness temperature to surface air temperature, as
derived for one set of passive microwave sensors (F10/F13), are identical to those of other
identical sensors (e.g., F14), or whether there is a statistically significant difference in
“true” regression coefficients between the sensors.

This  research  does  not  attempt  to  refine  the  land  type  determination  part  of  the 
CV  algorithm;  it  is  assumed  the  techniques  and  code  used  are  accurate.  Another 
assumption  is  that  the  synoptic  observations  of  surface  temperature  used  in  this  research 
are accurate. Finally, we will assume that all sources of error (explained in more detail in
Chapter  II)  can  be  reduced  to  an  acceptable  level  using  statistical  methods  alone. 

Research  Scope  and  General  Approach 

This  research  will  only  seek  to  refine  coefficients  of  the  existing  CV  algorithm 
and  check  to  see  if  regressed  coefficients  can  be  used  interchangeably  on  brightness 
temperature  measurements  from  all  SSM/I  platforms.  Thanks  to  previous  research  in  this 
field  (Harris,  1998;  and  Comoglio,  1997),  there  already  exists  on  hand  a  number  of 
brightness  temperature  data  sets  from  the  F10  and  F13  satellites,  as  well  as  a  large 
number  of  surface  observations  from  around  the  world.  The  first  step  was  to  match 
F10/F13  brightness  temperature  measurements  obtained  at  frequencies  of  19.3  GHz 
(horizontal  and  vertical  polarizations),  22.2  GHz  (vertical  polarization),  37  GHz 
(horizontal  and  vertical  polarizations),  and  85.5  GHz  (vertical  and  horizontal 
polarizations),  with  surface  observation  data  by  location  and  time  to  form  data  sets. 

These  sets  were  sorted  by  season,  land  type,  and  region.  The  matching  and  sorting  was 
accomplished  by  adapting  FORTRAN  code  written  by  Harris  and  Comoglio  and  by 
adapting  the  land  type-sorting  algorithm  from  the  CV  code.  The  data  was  then 
transferred  to  a  PC  and  statistical  analysis  was  performed  using  the  commercial  software 
package S-PLUS 4.5 by MathSoft, Inc., Cambridge, Massachusetts.
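The matching step described above can be sketched as follows. This is a minimal illustration in Python, not the adapted FORTRAN code itself. The record layout, the function names, and the distance and time thresholds (25 km and 30 minutes) are hypothetical placeholders, since the actual match criteria are defined in Chapter III.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean earth radius

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def match_observations(pixels, stations, max_km=25.0, max_dt=timedelta(minutes=30)):
    """Pair each surface observation with the nearest SSM/I pixel that lies
    within max_km and max_dt of it (both thresholds are assumptions).

    pixels:   list of dicts with keys lat, lon, time, tb (dict of channel values)
    stations: list of dicts with keys lat, lon, time, temp_k
    """
    matches = []
    for obs in stations:
        best, best_d = None, None
        for pix in pixels:
            if abs(pix["time"] - obs["time"]) > max_dt:
                continue  # outside the time window
            d = great_circle_km(obs["lat"], obs["lon"], pix["lat"], pix["lon"])
            if d <= max_km and (best is None or d < best_d):
                best, best_d = pix, d
        if best is not None:
            # One data set: observed temperature plus all channel values.
            matches.append({"temp_k": obs["temp_k"], **best["tb"]})
    return matches
```

Each returned record corresponds to one data set in the sense used here: a surface temperature observation together with the brightness temperatures of the pixel matched to it.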

After the F10/F13 data was analyzed, similar F14 SSM/I data was matched with
synoptic observation data acquired from the Air Force Combat Climatology Center
(AFCCC), and the data sets were sorted by CV-determined land type. A Bernoulli indicator
variable (Neter et al., 1983) was then assigned to each data point by satellite
identification: a value of 0 was assigned to Harris’s F10/F13 data and a value of 1 was
assigned to the new F14 data.
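To illustrate how such an indicator variable can test for a satellite effect, the following Python sketch compares a reduced model (one set of coefficients for all data) with a full model that adds the indicator and its products with each brightness temperature. The function names are hypothetical, and the inclusion of interaction terms is an assumption; this chapter does not state the exact form of the model used.

```python
import numpy as np

def sse(X, y):
    """Sum of squared residuals from an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def satellite_identity_f(tb, t_sfc, is_f14):
    """Partial F statistic for the null hypothesis that both satellite
    groups share a single set of regression coefficients.

    tb:     (n, p) array of brightness temperatures
    t_sfc:  (n,) array of observed surface temperatures
    is_f14: (n,) array of 0/1 indicator values (0 = F10/F13, 1 = F14)
    """
    n, p = tb.shape
    d = np.asarray(is_f14, dtype=float).reshape(-1, 1)
    reduced = np.hstack([np.ones((n, 1)), tb])   # intercept + channels
    full = np.hstack([reduced, d, d * tb])       # + indicator + interactions
    sse_r, sse_f = sse(reduced, t_sfc), sse(full, t_sfc)
    df_num = p + 1                # extra parameters in the full model
    df_den = n - 2 * (p + 1)      # residual degrees of freedom, full model
    return ((sse_r - sse_f) / df_num) / (sse_f / df_den)
```

A large F value relative to the F distribution with (p + 1, n - 2p - 2) degrees of freedom would suggest that satellite identity matters for that land type; a small value would suggest the data can be pooled.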

Results 

Some  4,000  usable  F10/F13  matches  (what  constitutes  a  “match”  is  explained  in 
Chapter  III)  were  found  and  divided  into  eighteen  groups:  by  seasons  (Fall  96  and  Winter 
97)  and  by  one  of  eight  different  land  types,  plus  an  “undetermined  land  type”  category. 
Fourteen  of  eighteen  groups  had  more  than  30  data  points,  thus  allowing  us  to  invoke  the 
Central  Limit  Theorem  (Devore,  1991),  and  therefore  justifying  the  use  of  linear 
regression.  Multiple  Linear  Regression  on  these  groups  yielded  root  mean  square  errors 
(RMSE)  between  2.55  K  and  4.58  K,  with  explanatory  power  (R  squared)  between  0.616 
and  0.803.  This  data  compares  quite  favorably  to  RMSE  values  ranging  from  5.3  K 
to 19.4 K in Harris’s analysis using the original CV coefficients. The error in the current
research  was  lowest  for  the  Desert  land  type  in  Winter  97.  The  lowest  RMSE  in  the  Fall 
96  data  set  was  for  “Light  Vegetation”  (RMSE  3.13  K).  Highest  errors  were  found  in  the 
“Undetermined  Land  Type”  category  and  “Wet  Soil”  categories  in  both  seasons.  Upon 
closer  analysis,  it  was  determined  that  the  “Wet  Soil”  land  type  in  Fall  96  had  a 
significantly  lower  RMSE  and  higher  explanatory  power  when  non-CONUS  data  points 
were  excluded  (4.28  K  /  0.625  to  3.41  K  /  0.688).  The  other  categories  only  showed 
slight  changes  in  RMSE  when  non-CONUS  data  points  were  excluded,  though  the 
“Undetermined  Land  Type”  category  for  Fall  96  showed  a  considerable  increase  in 
explanatory  power  (0.616  to  0.724). 
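For readers unfamiliar with the two summary statistics quoted above, the following Python sketch shows one way to obtain them from a multiple linear regression. It is illustrative only: the function name is hypothetical, and the RMSE is taken here as the square root of the residual mean square (error sum of squares divided by n - p - 1), which is an assumption about the convention used.

```python
import numpy as np

def fit_mlr(tb, t_sfc):
    """Fit t_sfc = b0 + b1*tb1 + ... + bp*tbp by least squares.

    Returns the coefficients, the RMSE (root of the residual mean square),
    and R squared.
    """
    n, p = tb.shape
    X = np.column_stack([np.ones(n), tb])
    beta, *_ = np.linalg.lstsq(X, t_sfc, rcond=None)
    resid = t_sfc - X @ beta
    sse = float(resid @ resid)                        # error sum of squares
    sst = float(((t_sfc - t_sfc.mean()) ** 2).sum())  # total sum of squares
    rmse = float(np.sqrt(sse / (n - p - 1)))
    r_squared = 1.0 - sse / sst
    return beta, rmse, r_squared
```

In this sketch, each column of `tb` would hold one SSM/I channel and each row one matched data set within a single season and land type group.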
From the F14 data, 2,581 usable matches were found from September 1997.
Since only 474 of these matches came from outside CONUS, the data was not separated
by region. These matches were then combined with 1,119 data points from Harris’s
F10/F13 data from October 1996, and a qualitative regression was performed using the
Bernoulli  indicator  variable  described  earlier.  This  analysis  indicated  satellite  identity 
did  not  have  a  statistically  significant  impact  upon  the  regression  coefficients  for  the 
“Moist  Soil,”  “Dense  Vegetation,”  “Light  Vegetation,”  “Desert,”  and  “Dry,  Arable  Soil” 
land  types.  Satellite  identity  did  have  a  statistically  significant  impact  upon  the 
regression  coefficients  for  the  “Semidesert”  and  “Wet  Soil”  land  types,  as  well  as  for 
those data sets for which the CV algorithm could not confirm a land type. A determination
could  not  be  made  for  the  “Mixed  Water  and  Vegetation”  land  type,  as  there  were  no  data 
points  in  this  category. 

For  the  five  land  types  for  which  satellite  identity  did  not  have  an  impact,  new 
regressions  (sans  indicator  variable)  of  the  combined  F10/F13/F14  data  sets  were 
performed.  RMSEs  from  the  new  regression  ranged  from  2.825  K  for  the  “Light 
Vegetation”  land  type  to  3.743  K  for  the  “Desert”  land  type.  R  squared  values  ranged 
from  .7295  for  “Desert”  to  .8613  for  “Dense  Vegetation.” 

To  confirm  the  qualitative  results  of  the  regressions,  the  F10/F13  and  F14  data 
were regressed separately and each data set was cross-validated using the regressed
coefficients from the other data set. Results largely confirmed the qualitative regression,
with  root  MSPR  values  ranging  from  2.687  K  for  the  F14  “Light  Vegetation”  data 
validated  with  the  F10/F13  coefficients,  to  4.773  K  for  the  F14  “Desert”  data  validated 
with  the  F10/F13  coefficients.  Similar  validation  of  the  rejected  land  types  confirmed  the 
rejection, with root MSPR values ranging from 5.22 K for F14 “Semidesert” data
validated with F10/F13 data, to 7.072 K for F10/F13 “Semidesert” data validated with
F14  coefficients. 
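The cross-validation statistic used above can be sketched in Python as follows. Coefficients are fitted on one data set and then applied, unchanged, to the other; the root MSPR is the square root of the mean squared prediction error on the validation set. The function names are hypothetical, and the sketch assumes the MSPR is averaged over the number of validation cases.

```python
import numpy as np

def fit_coefficients(tb, t_sfc):
    """Least-squares coefficients (intercept first) for one data set."""
    X = np.column_stack([np.ones(len(t_sfc)), tb])
    beta, *_ = np.linalg.lstsq(X, t_sfc, rcond=None)
    return beta

def root_mspr(beta, tb_val, t_sfc_val):
    """Root mean squared prediction error when coefficients fitted on one
    data set are applied to a different (validation) data set."""
    X = np.column_stack([np.ones(len(t_sfc_val)), tb_val])
    err = t_sfc_val - X @ beta
    return float(np.sqrt(np.mean(err ** 2)))
```

For example, `root_mspr(fit_coefficients(tb_f10f13, t_f10f13), tb_f14, t_f14)` would correspond to F14 data validated with F10/F13 coefficients, where the array names are placeholders for one land type's data.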

Summary 

The  purpose  of  this  research  is  to  refine  the  CV  coefficients  for  improved 
accuracy  in  retrieved  surface  temperature  measurements  and  to  determine  the 
compatibility  of  data  from  different  SSM/I  platforms  for  the  purpose  of  deriving  Multiple 
Linear  Regression  Coefficients.  Multiple  Linear  Regression  analysis  was  used  on 
Harris’s  data  to  re-derive  CV  coefficients,  reducing  the  RMSE  values  by  significant 
amounts  over  the  original  CV  coefficients.  F14  data  was  then  matched,  sorted,  and 
combined  with  Harris’s  data.  A  qualitative  regression  was  then  performed  to  determine  if 
satellite  identity  was  an  important  factor  in  the  regressions.  The  research  indicated  that 
the  platform  was  not  a  significant  contributor  for  5  of  the  8  land  types.  Satellite  identity 
was  significant  for  2  land  types,  as  well  as  data  points  for  which  the  CV  algorithm  could 
not  determine  a  land  type.  There  was  insufficient  data  for  regression  of  one  of  the  land 
types.  Cross-validation  of  the  data  largely  confirmed  these  results. 
II. Background


Chapter  Overview 

In  this  chapter,  the  physics  of  electromagnetic  radiation  will  be  introduced, 
including Planck’s Law and the applicability of the Rayleigh-Jeans Approximation in the
microwave  region  of  the  spectrum.  After  the  introduction,  the  history  of  microwave 
remote  sensing  will  be  discussed  briefly.  Then,  the  Defense  Meteorological  Satellite 
Program  (DMSP)  satellites  and  the  SSM/I  sensors  particular  to  the  study  will  be 
discussed.  At  this  point,  an  equation  relating  surface  temperature  to  the  amount  of 
microwave  radiation  emitted  will  be  described.  It  will  then  be  shown  that  the  equation 
cannot  be  solved  analytically  due  to  the  inverse  nature  of  the  problem.  However,  due  to 
large  amounts  of  data  available,  statistical  methods  such  as  multiple  linear  regression  can 
be  used  to  estimate  the  actual  relationship  between  the  brightness  temperatures  measured 
by the satellite and the observed surface temperatures. The Calibration/Validation (CV)
algorithm is one attempt to estimate surface temperatures in this way. After
outlining the advantages of refining CV coefficients and cross-validating algorithms on
data  from  different  DMSP  satellites,  potential  sources  of  error  in  determining  surface 
temperatures  with  passive  microwave  radiometry  will  be  outlined.  Finally,  a  short  review 
of  research  correlating  surface  temperatures  with  microwave  emissions  will  be  presented. 

Introduction 

All  objects  emit  energy  in  varying  intensities  throughout  the  electromagnetic 
spectrum.  The  intensity  of  this  emission  at  a  given  wavelength  is  a  function  of  the 
object’s  temperature  (Rees,  1990).  Since  we  are  interested  in  the  temperature  of  the 
earth, we will look at the wavelength distribution of radiation for a body of
approximately 300 K. If we apply Planck’s Law (Fleagle and Businger, 1980) and for the
purposes of illustration equate the earth to a black body of 300 K and examine the
spectrum  of  its  emissions  (see  Figure  1),  we  see  that  the  peak  of  the  earth’s  radiation 
would  fall  between  3  and  15  micrometers,  the  infrared  (IR)  portion  of  the  spectrum. 
While  viewing  the  infrared  portion  allows  relatively  easy  temperature  determination,  the 
existence  of  clouds  makes  measurement  of  surface  temperatures  difficult  (although  the 
determination  of  the  temperature  of  cloud  tops  can  be  extremely  useful  for  other 
meteorological  purposes).  Because  of  this,  and  for  the  other  reasons  we  discussed 
earlier,  scientists  have  turned  to  the  microwave  portion  of  the  spectrum  for  surface 
temperature  measurement. 
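The location of the emission peak for a 300 K black body can be checked numerically. The Python sketch below evaluates the wavelength form of Planck’s Law on a fine grid and reports the wavelength of maximum radiance. The function name and the grid are illustrative choices, and the wavelength form of the law is the standard one rather than an equation quoted from this chapter.

```python
import numpy as np

H = 6.6256e-34   # Planck's constant (J s)
K = 1.3805e-23   # Boltzmann's constant (J/K)
C = 2.9979e8     # speed of light (m/s)

def planck_wavelength(lam_m, temp_k):
    """Black body radiance per unit wavelength (W m^-2 sr^-1 m^-1)."""
    x = H * C / (lam_m * K * temp_k)
    return (2.0 * H * C ** 2 / lam_m ** 5) / np.expm1(x)

# Search from 1 to 100 micrometers in steps of 0.01 micrometer.
lam_um = np.linspace(1.0, 100.0, 9901)
radiance = planck_wavelength(lam_um * 1e-6, 300.0)
peak_um = lam_um[np.argmax(radiance)]
print(f"Peak emission for a 300 K black body: {peak_um:.2f} micrometers")
```

The peak found this way lies within the 3 to 15 micrometer range, in agreement with the infrared maximum described above.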


Figure  1:  Planck  Radiance  by  Temperature  and  Wavelength 
(Adapted from Kidder and Vonder Haar, 1995)
When comparing in situ synoptic surface temperature measurements with
temperatures  calculated  from  remotely  sensed  measurements  of  terrestrial  emission  in  the 
microwave  portion  of  the  spectrum,  there  are  several  potential  sources  of  variance 
between  the  two  values.  One  fundamental  source  of  variance  arises  from  the  difference 
in  measurement  method.  In  a  standard  thermometer,  the  sensor  is  in  direct  contact  with 
the  air  and  determines  the  temperature  via  conduction,  i.e.  allowing  kinetic  energy  from 
the  surrounding  air  molecules  to  be  transferred  to  the  sensor.  In  contrast,  because  it  is  by 
definition  not  in  direct  contact  with  the  earth  or  its  atmosphere,  a  sensor  on  board  an 
orbiting  platform  must  be  a  remote  sensor.  Also,  a  satellite-based  instrument  designed  to 
sense  microwave  emissions  is  measuring  the  magnitude  of  radiation  coming  from  a 
particular  solid  angle  of  view.  As  such,  a  satellite  must  take  into  account  the  shape  of  the 
earth  and  the  incident  angle  of  the  earth’s  surface  and  atmosphere  relative  to  the  sensor 
(Ulaby,  1981). 

Radiative  Transfer 

Radiative  transfer  is  the  flow  of  energy  from  an  object  to  a  receiver  at  or  near  the 
speed  of  light.  While  there  are  equations  that  quantify  this  radiation  flow,  detailed 
exploration  of  these  equations  does  not  contribute  much  to  the  understanding  of  radiative 
transfer  we  need  for  this  research.  Therefore,  we  will  keep  this  discussion  as  qualitative 
as  possible.  Paraphrasing  Ulaby  (1981),  the  intensity  of  radiation  measured  at  a  sensor  is 
given  by  two  terms.  The  first  term  is  the  radiation  from  the  object  propagating  toward 
the  sensor.  A  negative  exponential  extinction  factor  due  to  absorption  by  material  (or  the 
“medium”) reduces the magnitude of this first term between the sensor and the object.
The second term represents the emission and scattering by the intervening medium along
the  propagation  path. 

Before  we  proceed,  a  brief  word  on  polarization.  Polarization  is  the  orientation  of 
a  wave,  in  this  case  electromagnetic  waves,  relative  to  a  coordinate  system  (Rees,  1990). 
In  the  propagation  of  an  electromagnetic  wave,  the  electric  field  vector  and  magnetic 
intensity  vectors  are  orthogonal  (Fleagle  and  Businger,  1980).  As  a  result,  it  is  important 
for  a  sensor  to  take  into  account  two  polarizations  of  radiation  and  use  vector  addition  to 
be  able  to  interpret  a  full  picture  of  an  electromagnetic  wave.  Finally,  due  to  sign 
conventions,  it  is  possible  that  a  reading  in  a  given  polarization  may  have  a  negative 
contribution  to  the  intensity  of  the  electromagnetic  wave.  Therefore,  we  should  not  be 
surprised  if  regression  yields  oppositely  signed  coefficients  for  two  polarizations  in  the 
same  frequency  range. 

Now that a simplified explanation of the physics of radiative transfer has been given,
let  us  simplify  the  scenario  even  further:  suppose  there  is  no  intervening  material  between 
object  and  sensor.  This  removes  the  extinction  factor  from  the  first  term,  and  the  second 
term  altogether,  which  leaves  us  with  a  highly  desirable  result:  brightness  temperature 
measured  at  the  sensor  is  equal  to  the  brightness  temperature  (and  ultimately,  the  physical 
temperature)  of  the  object. 

As  a  final  simplification,  let  us  now  assume  the  object  is  a  blackbody,  i.e.  an 
idealized,  perfectly  opaque  material  that  absorbs  all  radiation  at  all  frequencies,  reflecting 
none,  and  emits  radiation  such  that  its  temperature  neither  increases  nor  decreases  as  a 
result  (Ulaby,  1981).  At  this  point,  we  can  now  invoke  Planck’s  radiation  law: 
$$B_\nu(T) = \frac{2h\nu^3}{c^2}\,\frac{1}{\exp(h\nu/kT) - 1} \qquad (1)$$

Where $B_\nu$ = Surface brightness (or radiance) (W m$^{-2}$ sr$^{-1}$ Hz$^{-1}$)
$\nu$ = Frequency (Hz)
$T$ = Physical temperature (K)
$h$ = Planck’s Constant ($6.6256 \times 10^{-34}$ J s)
$k$ = Boltzmann’s Constant ($1.3805 \times 10^{-23}$ J K$^{-1}$)
$c$ = Speed of light ($2.9979 \times 10^{8}$ m s$^{-1}$)

$B_\nu$ does not have a physical meaning unless the equation is integrated over a range of
frequencies. Normally, monochromatic radiance is measured in units of Watts per meter
squared  per  steradian  per  Hertz,  implying  integration  over  the  frequency  bandwidth  of  the 
sensor  channel. 

After  all  of  our  simplifications,  we  are  still  left  with  a  rather  troublesome  equation 
to  evaluate.  It  would  be  very  nice  if  we  could  somehow  reduce  equation  (1)  to  a 
polynomial.  One  way  to  change  an  exponential  into  a  polynomial  is  to  convert  the 
exponential  into  a  Taylor  polynomial  expansion. 

The  Rayleigh-Jeans  Approximation 

Even  for  those  of  us  who  are  not  adept  at  mathematics,  there  is  a  sense  of 
satisfaction  when  a  mathematical  technique  can  be  used  to  simplify  a  problem.  In  the 
case  of  equation  (1),  we  can  expand  the  exponential  using  the  Taylor  expansion: 


$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots \qquad (2)$$


In this case, $x = h\nu/kT$. If we now perform a dimensional analysis of the value of $h\nu/kT$
in the case of $T \sim 100$ K and frequency in the microwave region (i.e., approximately 100
GHz), we get:

$$\frac{h\nu}{kT} \approx \frac{10^{-34} \cdot 10^{11}}{10^{-23} \cdot 10^{2}} \approx 10^{-2} \ll 1 \qquad (3)$$

Thus, for the magnitudes in question for this research, we can neglect the higher order
terms of $h\nu/kT$, leaving us with the approximation $e^x \approx 1 + x$. Substituting this
back into equation (1) gives:

$$B_\nu(T) = \frac{2h\nu^3}{c^2}\,\frac{1}{(1 + h\nu/kT) - 1} = \frac{2h\nu^3 kT}{h\nu c^2} = \frac{2\nu^2 kT}{c^2} \qquad (4)$$

(for integration over frequency)

Similarly,  the  approximation  can  be  applied  to  the  wavelength  form  of  Planck’s  Law: 

$$B_\lambda(T) = \frac{2kT}{\lambda^2} \qquad (5a)$$

for integration over frequency, or

$$B_\lambda(T) = \frac{2ckT}{\lambda^4} \qquad (5b)$$

for  integration  over  wavelength. 

A very significant consequence of this approximation is that microwave monochromatic
radiance is directly proportional to temperature in equations (4) and (5). Our dimensional
analysis indicated that the higher order terms could be safely neglected. Indeed, Ulaby
mentions that the Rayleigh-Jeans approximation yields values within 1% of the Planck
equation for frequencies of less than 117 GHz. Since the highest frequency observed by
the sensor in this research is 85.5 GHz, it should be acceptable to use this approximation.
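The size of the approximation error at the SSM/I frequencies can be checked directly. The Python sketch below compares the Rayleigh-Jeans radiance with the full Planck radiance at each channel frequency. The temperature of 300 K is an assumption chosen to represent a terrestrial surface, and the function names are illustrative.

```python
import numpy as np

H = 6.6256e-34   # Planck's constant (J s)
K = 1.3805e-23   # Boltzmann's constant (J/K)
C = 2.9979e8     # speed of light (m/s)

def planck_freq(nu_hz, temp_k):
    """Black body radiance per unit frequency (W m^-2 sr^-1 Hz^-1)."""
    return (2.0 * H * nu_hz ** 3 / C ** 2) / np.expm1(H * nu_hz / (K * temp_k))

def rayleigh_jeans_freq(nu_hz, temp_k):
    """Rayleigh-Jeans approximation to the Planck radiance."""
    return 2.0 * nu_hz ** 2 * K * temp_k / C ** 2

def relative_error(nu_hz, temp_k):
    """Fractional amount by which Rayleigh-Jeans exceeds Planck."""
    exact = planck_freq(nu_hz, temp_k)
    return (rayleigh_jeans_freq(nu_hz, temp_k) - exact) / exact

for ghz in (19.3, 22.2, 37.0, 85.5):
    err = relative_error(ghz * 1e9, 300.0)
    print(f"{ghz:5.1f} GHz: relative error = {100.0 * err:.3f} %")
```

At 300 K the error stays below 1% for all four frequencies. Because the error scales with $h\nu/kT$, it grows as the temperature falls, so the choice of temperature affects the comparison.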

Non-Blackbody  and  Atmospheric  Effects 

Given  equation  (4)  above,  it  should  now  be  a  simple  matter  of  choosing  a 
frequency  in  the  microwave  range,  correlating  the  observed  brightness  temperature  with 
surface temperature, calculating the coefficient $2\nu^2 k/c^2$, and disseminating flawless
surface  temperature  readings  based  on  microwave  radiation.  However,  equation  (4)  is  an 
approximation.  We  must  now  step  back  and  examine  the  flaws  in  our  assumptions. 

The first problem is that the earth is not a blackbody. That is, it does not absorb all
radiation perfectly and emit the radiation perfectly in accordance with Kirchhoff’s Law
(Fleagle and Businger, 1980). To describe the emission properties of a non-black body, a
coefficient of emissivity ε is normally used to relate the actual radiance of a body at a
given  temperature  to  the  amount  the  body  would  radiate  if  it  were  a  black  body  (Fleagle 
and  Businger,  1980).  This  itself  would  not  be  so  bad  if  emissivity  were  a  single 
constant,  but  it  is  not.  Emissivities  vary  as  a  function  of  wavelength  and  temperature 
(Fleagle  and  Businger,  1980).  Worse  still,  emissivities  can  change  considerably  based  on 
the  molecular  structure  of  the  surface  (Ulaby,  1986).  While  the  latter  can  be  overcome 
when  the  surface  sensed  is  homogeneous,  it  can  become  quite  troublesome  for  composite 
surfaces,  such  as  those  which  are  on  land  (McFarland,  1991). 

Further  complicating  the  matter  is  that  a  non-blackbody  surface  reflects  radiation 
it does not absorb. Assuming no radiation is transmitted through the body, the emissivity
and the reflectivity sum to one. In other words, if a body has an emissivity of 0.65, it
would have a reflectivity of 0.35. The result of this is to introduce another factor that we
must  take  into  account  in  a  radiative  transfer  equation. 
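As a schematic illustration of this extra factor, and not the formulation used in the CV algorithm, the brightness temperature leaving an opaque surface can be written under the Rayleigh-Jeans approximation as an emitted part plus a reflected part. The symbols $T_s$ (physical surface temperature) and $T_{sky}$ (brightness temperature of downwelling radiation) are introduced here for illustration only, the value $T_{sky} = 30$ K is arbitrary, and the atmosphere between surface and sensor is neglected.

```latex
T_B = \varepsilon\, T_s + (1 - \varepsilon)\, T_{sky}

% Worked example with \varepsilon = 0.65, T_s = 300 K, T_{sky} = 30 K:
T_B = (0.65)(300\ \mathrm{K}) + (0.35)(30\ \mathrm{K})
    = 195\ \mathrm{K} + 10.5\ \mathrm{K}
    = 205.5\ \mathrm{K}
```

Even in this simplified case, the brightness temperature at the surface is far below the physical temperature, and the gap depends on an emissivity that is not known in advance.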

Thus,  emissivities  can  vary  considerably  on  land  in  both  space  and  time.  The 
composition  of  the  earth’s  surface  can  vary  considerably  in  a  few  kilometers,  while 
precipitation  can  alter  emissivities  and  resultant  detected  radiation  considerably  over  a 
short  time.  Indeed,  research  has  been  done  to  correlate  rainfall  rates  with  reductions  in 
detected  radiation  (Conner  and  Petty,  1998).  This  brings  us  to  another  factor  that  affects 
the  amount  of  radiation  reaching  a  sensor:  the  intervening  atmosphere. 

While  we  made  the  original  assumption  that  all  radiation  emitted  in  the  direction 
of  the  sensor  would  reach  that  sensor,  clearly  this  is  not  the  case.  There  is  still  the  matter 
of  absorption,  emission  and  scattering  of  microwave  radiation  by  the  atmosphere. 
Fortunately,  since  the  composition  of  the  atmosphere  is  relatively  constant  up  to  90km 
above  sea  level  with  the  exception  of  water  vapor  content  (Ulaby,  1981),  most 
attenuation  due  to  absorption  can  be  taken  into  account  relatively  easily.  The  remaining 
variances  we  must  take  into  account  come  from  the  water  vapor  absorption  and  emission 
in  the  22.2  GHz  and  183.3  GHz  bands  (Ulaby,  1981)  and  scattering  due  to  suspended 
water  droplets  and  other  hydrometeors  (Ulaby,  1981). 

Finally,  we  implicitly  made  the  assumption  that  the  surface  of  the  earth  was 
smooth  and  that  radiation  would  be  transmitted  toward  the  sensor  isotropically.  Because 
the  earth’s  surface  is  rough,  the  angle  at  which  the  surface  will  radiate  will  not  always  be 
directly  away  from  the  center  of  the  earth;  parts  of  the  surface  will  radiate  at  varying 
intensities  relative  to  the  vector  from  the  earth’s  surface  to  the  sensor.  The  result  is 
diffusion of the radiation, resulting in a reduction in the radiance sensed by a radiometer
(Ulaby,  1981). 

Now that the nature of the problem has been explained, we will take some
time to discuss the history of remote sensing that leads to the present state of passive
microwave radiometry. The information in the next section is paraphrased from
Section  1-2  of  Ulaby  (1981). 

Historical  Context 

Microwave  radiometry  can  be  traced  back  as  far  as  Heinrich  Hertz’s  first  radio 
experiments  in  1886.  In  a  test  of  Maxwell’s  electromagnetic  theory,  Hertz  constructed 
resonators  at  a  frequency  of 200  MHz,  which  is  quite  close  to  the  microwave  portion  of 
the  spectrum.  There  followed  in  the  early  20th  Century  many  experiments  in  the  radio 
portion  of  the  spectrum,  involving  continuous  and  pulse  wave  radio  detection  and  ranging 
(RADAR)  devices.  In  the  1920s,  the  U.S.  Naval  Research  Laboratory  conducted 
experiments  to  detect  ships  and  aircraft,  while  other  researchers  used  radio  pulses  to 
measure  the  height  of  the  ionosphere. 

The  development  of  radar  continued  in  the  1930s  and  1940s.  The  advent  of 
World  War  II  hastened  radar  development,  including  a  long-wave  system  that  was 
deployed  in  aircraft.  By  1946,  radars  operating  at  frequencies  of  3,  10  and  24  GHz  were 
in  service  and  producing  images  of  the  ground.  It  was  noted  that  the  24  GHz  band  was 
not  always  effective  because  of  the  tendency  of  water  vapor  to  absorb  radiation  of  that 
frequency  (24  GHz  is  close  to  the  22  GHz  water  vapor  absorption  feature;  pressure 
broadening  leads  to  significant  absorption  at  24  GHz  as  well). 


By  the  1950s,  the  side-looking  airborne  radar  (SLAR)  was  developed.  With  this 
radar  and  its  long  antenna,  images  produced  were  of  finer  resolution.  Most  SLAR 
systems  operated  at  frequencies  of  10  GHz,  16  GHz,  and  35  GHz,  though  some  operated 
at  even  higher  frequencies.  One  of  these  systems,  the  AN/APQ-97,  was  declassified  in 
1964. The images created from this system over the CONUS in 1965 and 1966 were still
being  studied  decades  later. 

Another  major  improvement  was  the  development  of  synthetic  aperture  radar, 
which  improved  image  resolution  even  further.  With  the  dawn  of  the  Space  Age, 
proposals  were  drawn  up  to  place  synthetic  aperture  radars  into  space.  The  lag  from 
proposal  to  action  was  considerable,  however,  as  the  first  such  radar  launched  was  on 
Seasat  in  June  of  1978. 

Passive microwave (mw) remote sensing stands in contrast to radar, or active mw
remote sensing. Rather than emitting a pulse of radiation at its target and
recording the amount of radiation reflected, a passive sensor measures the amount of
radiation naturally emitted by its target. The first spaceborne passive microwave
radiometer (PMR) to acquire data did not observe Earth, but Venus. In
1962, the Mariner 2 space probe flew past Venus with a two-channel microwave
radiometer aboard. The first orbiting platform to acquire such data for Earth was the
Cosmos 243 satellite, launched by the Soviet Union in 1968. The first American satellite
with such capabilities was Nimbus 5, launched by the National Aeronautics and
Space Administration (NASA) in 1972.

The first spaceborne PMR systems launched by the US military were incorporated
in its series of Defense Meteorological Satellite Program (DMSP) polar orbiting
satellites, the first of which was launched in 1978 and was used to recover atmospheric
temperature profiles (Ulaby, 1981). The first Special Sensor Microwave Imager
(SSM/I), the radiometer studied in this research, was aboard the DMSP satellite F8,
launched in 1987 (Hesser, 1995). Until recently, most PMR research concentrated on
atmospheric temperature profiles and wind speed over smooth surfaces. The first attempt
to correlate PMR data with surface temperatures was by McFarland et al. (1991) in their
development of the Calibration/Validation (CV) algorithm, which sorted data by land
type and then calculated a surface temperature.

Remote  sensing  of  land  surface  temperature  is  of  special  interest  to  the  military 
for  two  reasons.  First,  accurate  surface  temperatures  would  lead  to  improved  input  for 
computer  forecast  models.  Second,  the  ability  to  know  the  surface  temperature  gives  the 
ground  commander  a  tactical  advantage  in  preparation  for  action  in  a  data-sparse  or 
enemy  held  area.  For  example,  knowledge  of  the  land  surface  temperature  can  assist  in 
determining  the  times  at  which  infrared  sensors  will  effectively  detect  various  targets. 
Since  this  research  concentrates  on  the  use  of  the  SSM/I  to  this  end,  the  next  section 
provides  information  on  the  DMSP  program  and  the  SSM/I. 


DMSP  and  the  SSM/I 

DMSP satellites are in a near-polar, sun-synchronous orbit at an altitude
of approximately 830 km above the earth. Each satellite provides twice-daily global
coverage and has an orbital period of about 101 minutes. Visible and infrared sensors
collect images of global cloud distribution across a 3,000 km swath during both daytime
and nighttime conditions. The coverage of the microwave imager is one-half that of the
visible and infrared sensors; thus the polar regions above 60° latitude are imaged on a
twice-daily basis, but the equatorial regions are viewed on a daily basis (NGDC, 1998a).
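
As a quick consistency check on the figures above, the period of a circular orbit at 830 km altitude can be estimated from Kepler's third law. The sketch below assumes a spherical Earth with a mean radius of 6371 km and the standard gravitational parameter; the function name is illustrative only and is not part of any DMSP software.

```python
import math

EARTH_RADIUS_KM = 6371.0        # assumed mean Earth radius
EARTH_MU_KM3_S2 = 398600.4418   # Earth's gravitational parameter GM (km^3/s^2)


def circular_orbit_period_minutes(altitude_km):
    """Period of a circular orbit at the given altitude (Kepler's third law)."""
    semi_major_axis_km = EARTH_RADIUS_KM + altitude_km
    period_s = 2.0 * math.pi * math.sqrt(semi_major_axis_km ** 3 / EARTH_MU_KM3_S2)
    return period_s / 60.0


# For the approximate DMSP altitude quoted in the text
print(circular_orbit_period_minutes(830.0))
```

Under these assumptions the result falls between 101 and 102 minutes, in line with the orbital period quoted above.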

The SSM/I instrument consists of an offset parabolic reflector that is 24 x 26
inches fed by a seven-port horn antenna. The reflector and feedhorn are mounted on a
rotating  drum  that  contains  the  radiometers,  digital  data  subsystem,  mechanical  scanning 
subsystem,  and  power  subsystem.  A  small  mirror  and  a  hot  reference  absorber  are 
mounted  on  the  assembly  for  calibration  purposes. 


Figure  2:  Schematic  of  a  DMSP  Satellite 

The  instrument  sweeps  a  45°  cone  around  the  satellite  velocity  vector  so  that  the 
Earth  incidence  angle  is  always  54°.  Data  are  recorded  when  the  antenna  beam  intercepts 
the  Earth's  surface.  The  channel  footprint  varies  with  channel  number  (or  frequency), 
position  in  the  scan,  along-scan  or  along-track  direction,  and  altitude  of  the  satellite.  The 
85  GHz  footprint  is  the  smallest  at  13  x  15  km  and  the  19  GHz  footprint  is  the  largest  at 
43 x 69 km. Because the 85 GHz footprint is so small, it is sampled twice as often, i.e.
128 times a scan. One data cycle consists of four 85 GHz scans and two scans of the 19, 22,
and 37 GHz channels. The complete cycle takes 28 seconds and it must be complete to
process the data (NGDC, 1998b).


Figure  3:  SSM/I  Scan  Geometry  (Adapted  from  Hollinger,  1983) 

As we can see in Figure 4, the SSM/I scan frequencies were chosen for a reason.
The 19.3, 37 and 85.5 GHz channels are in electromagnetic “window” regions, while the
22.2 GHz channel is in the middle of a water vapor absorption band and is thus highly
sensitive to changes in water vapor content. In addition, the 85.5 GHz channel is highly
sensitive to scattering and can be used to detect scattering patterns associated with
rainfall (Conner and Petty, 1998).


[Figure 4 horizontal axis: Wavelength (cm)]


Figure  4:  Percentage  Transmission  through  Clear  Atmosphere  to  Space 
(Adapted  from  Ulaby,  1981) 


Now  that  we  have  reviewed  the  DMSP  satellites  and  the  SSM/I  instrument,  we 
will  next  look  at  the  realistic  problem  of  retrieving  land  surface  temperatures  from 
satellite-derived  microwave  brightness  temperatures. 


The  Inverse  Problem 

If  we  take  into  account  all  of  the  factors  we  mentioned  in  our  “Radiative 
Transfer”  primer  earlier,  using  Hollinger  (1983),  Rees  (1990)  and  Ulaby  (1981)  as 
guides,  we  can  incorporate  the  apparent  brightness  temperature  into  a  radiative  transfer 
equation  (RTE),  which  relates  the  measured  brightness  temperature  at  a  certain  frequency 
to the desired surface temperature. The equation is rather involved (Harris, 1998):

\[
T_B(H) = \varepsilon T_0 e^{-\tau(0)} + r T_{sky} e^{-2\tau(0)}
       + \int_0^H k_e T_{a\_up} e^{-\tau(z)} \, dz
       + r \left( \int_0^H k_e T_{a\_down} e^{-[\tau(0) - \tau(z)]} \, dz \right) e^{-\tau(0)}
\tag{6}
\]

where

T_B(H) = Apparent brightness temperature at satellite height H
ε = Emissivity of ground
T_0 = Physical temperature of ground
e^(-τ) = Atmospheric reduction factor
τ(z) = Optical depth above level z (so τ(0) is the optical depth of the whole atmosphere)
r = Reflectivity of the ground
T_sky = Brightness temperature of the “sky” (i.e. extraterrestrial radiation sources)
k_e = Atmospheric extinction coefficient at level z
T_a_down and T_a_up = Physical temperature of an atmospheric layer of thickness dz at level z


In other words, the RTE shows that the apparent brightness temperature at the sensor is a
function  of  (a)  the  brightness  temperature  contribution  by  the  surface,  i.e.  the  emissivity 
times  the  blackbody  brightness  temperature,  further  reduced  by  atmospheric  attenuation; 
(b)  extraterrestrial  background  radiation,  reflected  by  the  earth  and  twice  attenuated  by 
the  earth’s  atmosphere  (once  in  and  once  out);  (c)  upward  atmospheric  emission, 
attenuated  by  the  portion  of  the  atmosphere  between  the  emission  point  and  the  sensor; 
and  (d)  downward  atmospheric  emission,  reflected  by  the  earth  and  attenuated  by  the 
appropriate  amount  of  atmosphere. 
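
The four contributions (a) through (d) can be illustrated with a small forward calculation for a layered atmosphere. This is only a sketch under stated assumptions, not the formulation used in this research: the atmosphere is plane-parallel and non-scattering, each layer emits T(1 - e^(-Δτ)) (which reduces to k T dz for thin layers), the surface reflectivity is taken as r = 1 - emissivity, and the function name, layer tuples, and default sky temperature of 2.7 K are all hypothetical.

```python
import math


def apparent_brightness_temperature(t_surface, emissivity, layers, t_sky=2.7):
    """Forward sketch of the four RTE contributions for a layered atmosphere.

    layers: list of (extinction_per_km, temperature_K, thickness_km),
            ordered from the ground upward.
    """
    reflectivity = 1.0 - emissivity  # assumption: opaque surface, r = 1 - emissivity
    layer_tau = [k * dz for k, _, dz in layers]
    total_tau = sum(layer_tau)

    # (a) surface emission, attenuated once by the whole atmosphere
    surface = emissivity * t_surface * math.exp(-total_tau)
    # (b) sky radiation, reflected by the ground and attenuated twice
    sky = reflectivity * t_sky * math.exp(-2.0 * total_tau)

    upwelling = 0.0    # (c) upward emission, attenuated by the air above the layer
    downwelling = 0.0  # downward emission arriving at the ground
    tau_below = 0.0
    for (_, temp, _), dtau in zip(layers, layer_tau):
        tau_above = total_tau - tau_below - dtau
        emitted = temp * (1.0 - math.exp(-dtau))
        upwelling += emitted * math.exp(-tau_above)
        downwelling += emitted * math.exp(-tau_below)
        tau_below += dtau

    # (d) downward emission, reflected by the ground and attenuated on the way up
    reflected_down = reflectivity * downwelling * math.exp(-total_tau)
    return surface + sky + upwelling + reflected_down


# Hypothetical three-layer atmosphere over a 295 K surface with emissivity 0.92
example_layers = [(0.03, 285.0, 1.0), (0.015, 270.0, 3.0), (0.005, 240.0, 6.0)]
print(apparent_brightness_temperature(295.0, 0.92, example_layers))
```

A useful sanity check on such a sketch is the isothermal case: when the surface, atmosphere, and sky all share one temperature, the four terms sum to exactly that temperature regardless of emissivity. The forward direction is easy to compute; the difficulty discussed next is going the other way.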

Clearly, if we knew contributions (a) through (d), we could easily calculate the apparent
brightness temperature. The problem lies in that we know the apparent brightness
temperature and want to find the surface temperature. Such an “inverse” problem does
not lend itself to straightforward analytical solution (Ulaby, 1986). Therefore, other
methods such as statistical methods must be employed.

The General Linear Model


Though we could theoretically solve our problem numerically, there are too many
variable parameters to do so in practice. However, we do not need to solve the RTE.
Instead, we can gather a large number of brightness temperatures and corresponding surface
temperatures.  This  abundance  of  data  lends  itself  quite  well  to  statistical  methods  such  as 
Multiple  Linear  Regression  (Neter  et  al.,  1983). 

The  general  linear  regression  model  assumes  (a)  there  exists  a  linear  relationship 
between  the  random  variable  Y  (surface  temperature)  and  a  linear  combination  of  other 
variables  (channel  brightness  temperature),  and  (b)  the  value  of  Y  minus  the  expected 
value  of  Y  (or  residual)  can  be  represented  by  a  normal  distribution  of  mean  0  and  some 
set  standard  deviation  (Neter  et  al.,  1983).  In  the  case  of  this  research,  a  linear  brightness 
temperature-surface  temperature  relationship  and  Gaussian  distribution  of  temperature 
deviation  is  reasonable  (Ulaby,  1986).  Further,  the  large  number  of  data  sets  analyzed  in 
this  research  means  we  can  invoke  the  Central  Limit  Theorem,  which  allows  us  to 
assume  a  Gaussian  distribution  of  parameters  not  taken  into  account  in  the  regression 
model  (Wilks,  1993),  thus  allowing  us  to  satisfy  assumption  (b). 

Given  these  assumptions,  we  can  view  the  relationship  between  a  single  surface 
temperature  observation  and  the  seven  corresponding  channel  brightness  temperatures  as 
given  by  the  following  equation: 

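\[
Y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_7 x_{i7} + \epsilon_i \tag{7}
\]

An easier way to look at the above equation is in matrix format. Thus if we define
vectors Y, β, and ε, and a matrix X as follows:

\[
\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}, \quad
\mathbf{X} = \begin{bmatrix}
1 & x_{11} & \cdots & x_{17} \\
1 & x_{21} & \cdots & x_{27} \\
\vdots & \vdots & & \vdots \\
1 & x_{n1} & \cdots & x_{n7}
\end{bmatrix}, \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_7 \end{bmatrix}, \quad
\boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
\]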

we could rewrite equation (7) as (Neter et al., 1983)

\[
\mathbf{Y} = \mathbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon} \tag{8}
\]

where

Y is a vector of surface temperature observations
β is a vector of coefficients
X is a matrix of brightness temperature values
ε is a vector of independent normal random variables with expected value of 0 and
variance σ²I (where I is the Identity Matrix)


Knowing β, we can retrieve Y (the vector of surface temperatures) from X (the
matrix of brightness temperatures). Unfortunately, we cannot use equation (7) to
calculate the “true” coefficients because we do not know what the error values are.
Fortunately, we can use the method of least squares (Devore, 1995) to estimate the
regression coefficients β. In matrix form, Neter et al. (1983) describe the calculation of
the vector of estimated coefficients b (an estimate of β) as:

\[
\mathbf{b} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{Y} \tag{9}
\]

In our experiment, we are dealing with thousands of data sets and seven
brightness  temperature  channels  to  regress.  Inverting  and  transposing  such  matrices  by 
hand  would  be  highly  cumbersome.  Fortunately,  there  exist  many  software  packages  to 
make  these  calculations  for  us.  The  software  package  I  used  was  S-Plus  4.5  by  Mathsoft, 
Inc.  (1997). 
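
To make the least squares estimate concrete, the following is a minimal sketch in plain Python rather than S-Plus. It solves the normal equations (XᵀX) b = XᵀY by Gaussian elimination, which is equivalent to the matrix inverse form for a well-conditioned problem. The function names and the two-channel synthetic data are assumptions for illustration only; they are not the data or code used in this research.

```python
def solve_linear_system(a, rhs):
    """Solve a x = rhs by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def least_squares(x_rows, y):
    """Return b = [b0, b1, ...] solving the normal equations (X^T X) b = X^T Y."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1 gives the intercept
    columns = list(zip(*design))
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in columns]
    return solve_linear_system(xtx, xty)


# Synthetic example: two hypothetical channels and noise-free "surface temperatures"
tb = [(250.0, 240.0), (260.0, 245.0), (255.0, 250.0), (270.0, 260.0), (265.0, 248.0)]
ts = [10.0 + 0.5 * v19 + 0.3 * v37 for v19, v37 in tb]
print(least_squares(tb, ts))
```

With noise-free synthetic data the estimated coefficients reproduce the generating values (10, 0.5, 0.3) to within rounding. A production analysis would normally rely on a tested statistical package, as was done here with S-Plus.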

Stepwise  Linear  Regression 

Whenever  possible,  it  is  desirable  to  remove  variables  from  a  regression  in  order 
to simplify the math involved. When a certain variable X makes only a
marginal contribution to the multiple R-squared value (for further explanation, see
section 13.4 of Devore, 1995) of the regression estimate, that variable can be removed
without significantly reducing the accuracy of the regression. One of the most common
methods  of  removing  variables  in  this  manner  is  stepwise  linear  regression.  The  method 
of  stepwise  regression  develops  a  series  of  regression  models,  at  each  step  adding  or 
deleting  a  variable  (Neter  et  al.,  1983).  Fortunately,  the  S-Plus  software  has  the 
capability  to  perform  stepwise  regression. 
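
The idea of adding variables one at a time can be sketched as follows. This is a simplified, forward-only illustration and is not the S-Plus procedure used in this research: it adds the predictor that most increases R-squared and stops when the gain drops below a threshold, whereas full stepwise regression also considers deleting variables and typically uses F-tests. The function names, the `min_gain` threshold, and the synthetic predictors are all hypothetical.

```python
def fit_r2(columns, y):
    """Least-squares fit with intercept; returns the multiple R-squared."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Augmented normal equations [X^T X | X^T y]
    aug = [[sum(row[j] * row[k] for row in design) for k in range(p)]
           + [sum(row[j] * yi for row, yi in zip(design, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(aug[r][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(p):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    coef = [aug[j][p] / aug[j][j] for j in range(p)]
    mean_y = sum(y) / n
    sse = sum((yi - sum(bj * xj for bj, xj in zip(coef, row))) ** 2
              for row, yi in zip(design, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def forward_stepwise(predictors, y, min_gain=0.01):
    """Add predictors one at a time while R-squared improves by at least min_gain."""
    selected, best_r2 = [], 0.0
    remaining = list(predictors)
    while remaining:
        trial = {name: fit_r2([predictors[s] for s in selected + [name]], y)
                 for name in remaining}
        best_name = max(trial, key=trial.get)
        if trial[best_name] - best_r2 < min_gain:
            break
        selected.append(best_name)
        remaining.remove(best_name)
        best_r2 = trial[best_name]
    return selected, best_r2


# Synthetic predictors: y depends on x1 and x2 only; x3 is unrelated
data = {
    "x1": [1, 2, 3, 4, 5, 6, 7, 8],
    "x2": [1, -1, -1, 1, 1, -1, -1, 1],
    "x3": [3, 1, 4, 1, 5, 9, 2, 6],
}
target = [5 + 2 * a + 3 * b for a, b in zip(data["x1"], data["x2"])]
print(forward_stepwise(data, target))
```

In this synthetic case the unrelated predictor is never selected, because it adds essentially nothing to R-squared once the two real predictors are in the model. With highly correlated predictors the outcome is much less clear-cut.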

It is important to note, however, that stepwise linear regression has its limitations.
Among the most significant to our research is that stepwise regression will occasionally
arrive at an unreasonable subset of variables when the overall variable set is highly
correlated (Neter et al., 1983). Since the full set of brightness temperatures can have
correlation as high as 0.9 at times (McFarland, 1991), it will be important to keep an eye
on the “aptness” of the model. Among the methods used to test regression aptness is to
check that the residuals do not deviate significantly from normality (Neter and
Wasserman, 1974). This can be done with a number of plots, which are available in
S-Plus. Such plots of the regressions performed in this research will be shown in Chapter
IV and Appendix B.

Antenna  Temperature  vs.  Brightness  Temperature 

The  SSM/I  sensors  do  not  detect  brightness  temperature  directly.  Rather,  the 
sensors  measure  what  is  known  as  antenna  temperature.  In  order  to  explain  this,  we  must 
delve  briefly  into  antenna  theory. 

The best way to think of a passive antenna is as a wire. It functions to guide
electromagnetic waves along itself (Ulaby, 1981). The radiometer
measures the voltage changes at various frequencies along the antenna. Given a
bandwidth  and  the  physical  characteristics  of  the  antenna,  the  voltage  counts  transmitted 
by  the  satellite  can  be  translated  into  brightness  temperatures.  For  the  SSM/I  instrument, 
Air  Force  Weather  Agency  (AFWA),  Offutt  AFB,  translates  the  raw  counts  into 
brightness  temperatures  and  places  them  into  a  Sensor  Data  Record  (SDR).  It  is 
important  to  keep  in  mind  that  the  brightness  temperatures  studied  did  not  come  directly 
from  the  satellite  but  were  first  converted  from  voltage  counts;  this  introduces  another 
possible  source  of  error  into  the  research,  calibration  error. 

The Calibration/Validation (CV) Algorithm

The  algorithm  studied  in  this  research  is  the  CV  algorithm.  For  a  given  “scene,” 
this  routine  first  looks  at  relationships  between  the  seven  channel  brightness  temperatures 
in  order  to  determine  a  land  type.  The  brightness  temperature  ranges  relevant  to  the 
respective land types were determined experimentally by observing channel brightness
temperature relationships over known land types (Hollinger, 1983). The CV program
first separates data into 8 different land types: dry, arable soil; moist soil; semidesert;
desert; dense vegetation; mixed water and vegetation; light vegetation; and wet soil.
Following is an example of a land type-sorting algorithm for the “dry, arable soil” land
type:


If (V22 - V19) < 4.0 and
4.0 < (V19 + V37) / 2 < 9.8 and
(V37 - V19) > -6.5 and
-5.0 < (V85 - V37) < 0.5 and
(H85 - H37) < 4.2 then

land type = dry arable soil

where variables are named by polarization and frequency, e.g. “V22” is the brightness
temperature measured for vertical polarization at 22 GHz.

In  the  original  CV  algorithm,  these  land  types  were  then  recombined  into  four  more 
general  land  types:  dense  vegetation,  agricultural  /  range,  moist  soils,  and  dry  soils. 
Multiple  linear  regression  was  performed  and  coefficients  were  calculated  for  each  land 
type  (McFarland,  1991).  In  this  research,  we  will  concentrate  on  the  eight  original  land 
types,  plus  the  data  points  for  which  the  CV  algorithm  could  not  determine  a  land  type. 

Indicator  Variables  and  Qualitative  Regression 

This research investigates two different sets of data: one set consisting of data
from the F10 and F13 DMSP satellites, and another set from the F14 DMSP satellite.
Part of the research involves determining if there is a statistically significant difference
between the regression coefficients obtained for the two data sets. The simplest method
to perform such a qualitative regression is the use of an indicator variable (Neter et al.,
1983).

An indicator variable is a variable that identifies a category of data. In this case,
our categories are F10/F13 data and F14 data. For our research, we may create a variable
with the values 0 and 1: 0 for the F10/F13 data and 1 for the F14 data. Regression can
then be performed on the combined (F10/F13/F14) data set, including the indicator
variable with the other relevant variables in the regression. The null hypothesis will be
that the coefficient of the indicator variable in the regression will be zero, i.e. the
regression functions of the two data sets are identical. The alternative hypothesis will be
that the coefficient of the indicator variable will not be zero. A standard t-test or partial-F
test can be performed to determine whether or not to reject the null hypothesis. For
further information on the use of indicator variables, see Chapter 10 of Neter et al.
(1983).
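
Written out, the indicator-variable test described above might take the following form. This is a sketch that assumes all seven channel brightness temperatures are retained in the model; the symbols I_i, β_8, b_8 and s{b_8} are introduced here for illustration. In practice the stepwise procedure may drop some channels, which changes the degrees of freedom accordingly.

\[
Y_i = \beta_0 + \sum_{k=1}^{7} \beta_k x_{ik} + \beta_8 I_i + \epsilon_i,
\qquad
I_i = \begin{cases} 0 & \text{observation } i \text{ from F10/F13} \\ 1 & \text{observation } i \text{ from F14} \end{cases}
\]

\[
H_0: \beta_8 = 0 \qquad H_a: \beta_8 \neq 0 \qquad
t^* = \frac{b_8}{s\{b_8\}}, \quad \text{reject } H_0 \text{ if } |t^*| > t(1 - \alpha/2;\, n - 9)
\]

Here b_8 is the least squares estimate of β_8, s{b_8} is its estimated standard deviation, and n - 9 is the error degrees of freedom for a model with nine parameters. In this form the indicator shifts only the intercept of the regression function between the two data sets.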

Earlier in this chapter, most of the sources of error were introduced. However, it
is useful to list the potential sources of error in one place. In the next
section, these sources will be summarized.

Sources  of  Error 

Given  the  sizes  of  the  SSM/I  channel  “footprints,”  the  most  glaring  source  of 
error  is  the  potential  mixture  of  land  types  in  a  single  SSM/I  measurement.  The  land  type 
determination  of  the  CV  algorithm  is  far  from  exact,  but  even  if  it  were  exact,  brightness 
temperatures  are  taken  over  areas  greater  than  100  square  kilometers.  A  number  of 
varying land types viewed within a given pixel will lead to several surface emissivities
contributing  to  a  single  measurement. 

Another  source  of  error  is  that  we  are  using  surface  temperature  observations  to 
correlate  with  satellite-measured  brightness  temperatures.  These  two  measurement 
methods  do  not  always  give  consistent  results.  For  example,  the  ground-measured  air 
temperature  trends  since  1979  differ  significantly  from  those  observed  by  satellite-borne 
PMR  (Hurrell  and  Trenberth,  1996).  This  difference  of  instrumentation  may  introduce  a 
systematic  error  to  our  results. 

Another  source  of  error  is  the  assumption  of  a  linear  relationship  between 
brightness  temperature  and  surface  temperature.  While  the  Rayleigh-Jeans 
approximation  lends  validity  to  the  assumption  of  a  linear  relationship  between  terrestrial 
microwave  radiation  and  land  surface  temperature,  it  only  applies  to  one  part  of  the  RTE. 
There  are  contributions  from  three  other  parts  of  the  RTE  -  two  components  due  to 
atmospheric  microwave  radiation  (one  upward  radiation  to  the  sensor,  and  one  downward 
radiation  reflected  by  the  earth),  and  one  component  due  to  reflected  extraterrestrial 
microwave  radiation.  These  contributions  are  not  necessarily  linear,  which  adds  a 
nonlinear  aspect  to  the  problem. 

A  fourth  source  of  error  is  the  regression  itself.  Even  if  the  brightness 
temperature-surface  temperature  relationship  were  exactly  linear,  because  of 
measurement  error,  the  regression  can  only  calculate  an  approximation  of  the  actual,  ideal 
coefficients  (Neter  et  al.,  1983).  However,  the  larger  the  regressed  data  set  is,  the  more 
accurate  the  approximation  to  the  true  coefficients  will  be. 

A fifth potential source of error is calibration error. If the calibration was not
performed  correctly,  the  conversion  from  radiation  counts  to  brightness  temperatures 
could  introduce  a  systematic  error  to  the  calculations.  It  has  been  noted  that  different 
calibration  averaging  techniques  and  other  processing  can  result  in  different  antenna 
temperatures  being  calculated  from  the  same  radiation  counts  (Ritchie  et  al.,  1998).  The 
accuracy  of  the  algorithm  used  to  translate  SSM/I  radiation  counts  to  brightness 
temperatures  is  beyond  the  scope  of  this  research  and  will  not  be  examined. 

Still another source of error is the accuracy of the land type algorithm itself.
While diagnosis of the land type sorting will not be performed in this research, it is
important to note that an erroneous land type diagnosis will introduce error. Such
miscategorization could be caused by a number of factors, not the least of which is the
presence of hydrometeors.

The  effects  of  precipitation  on  microwave  emissions  are  twofold:  first,  a  higher 
liquid  water  amount  translates  to  a  higher  brightness  temperature,  especially  in  the  19 
GHz  range.  Second,  scattering  by  large  ice  particles  at  high  frequencies  (e.g.  in  the  85 
GHz  range)  reduces  the  amount  of  radiation  reaching  the  satellite  (Liu  and  Curry,  1998). 

The  final  source  of  error  is  the  error  universal  to  any  experiment  -  noise.  From 
noise  in  the  radiometer  measurements,  to  noise  in  the  measurements  of  surface 
temperature, instrumentation error will add some variability to the results. If this noise
is of a random nature, it will conform to a Gaussian distribution and be averaged out of
the results through linear regression.

Related Research


An  initial  search  for  research  on  PMR-derived  land  surface  temperatures  only 
yielded  the  previous  work  done  at  AFIT  (Harris,  1998;  and  Comoglio,  1997).  While  no 
additional  research  concerning  SSM/I-derived  land  surface  temperatures  was  found,  there 
were  several  articles  that  provided  background  on  PMR,  possible  sources  of  error,  and 
other  uses  for  PMR.  One  richly  researched  example  was  use  of  PMR  to  determine 
rainfall  rates. 

Hurrell  and  Trenberth  (1996)  examined  the  differences  in  observed  trends 
between  near-global  monthly  mean  surface  temperature  anomalies  and  those  of  global 
Microwave  Sounding  Unit  2R  (MSU  2R)  temperatures  for  1979-1995.  The  variability  of 
surface-observed  temperatures  was  found  to  be  small  over  the  oceans  but  large  over  the 
land,  while  MSU  2R  measured  variations  that  were  much  more  zonally  symmetric.  Also, 
the  two  measurement  systems  gave  greatly  dissimilar  responses  to  volcanic  eruptions,  the 
El  Nino  Southern  Oscillation  (ENSO),  and  changes  in  stratospheric  ozone.  Hurrell  and 
Trenberth  conclude  that  most  of  the  differences  can  be  attributed  to  physical  differences 
between  the  two  measurement  techniques.  They  emphasize  that  neither  technique  is 
more  correct,  rather  that  both  techniques  give  a  different  perspective  on  the  same  events. 

Ritchie  et  al.  (1998)  investigated  differences  in  PMR-derived  rainfall-rate 
products  from  NGDC  and  Fleet  Numerical  Meteorology  and  Oceanography  Center 
(FNMOC),  which  used  identical  SSM/I  data.  The  differences  were  traced  to  different 
calibration  averaging  techniques  and  other  processing  methods,  which  yielded  different 
antenna  temperatures  from  the  same  data  sets.  The  effects  of  these  temperature 
differences  were  then  examined  by  generating  rain  rates  using  the  Goddard  Scattering 
Algorithm. The conclusion was that it was possible to infer greater rain rates from a cold
bias  introduced  in  antenna  temperature  processing.  Conversely,  rain  rates  would  be 
lower  if  the  antenna  temperature  processing  yielded  a  warm  bias. 

Liu  and  Curry  (1998)  examined  the  effects  of  hydrometeors  on  microwave 
emissions.  In  their  research,  they  attempted  to  determine  if  a  relationship  existed 
between  polarization  differences  (D)  at  19  GHz  and  polarization  corrected  temperature 
(PCT) at 85 GHz. Regional means (5° latitude x 5° longitude) of these parameters over
global oceans were calculated for areas of no precipitation, light precipitation, and heavy
precipitation.  In  the  case  of  no  precipitation,  a  small  variation  of  PCT  could  be  achieved 
by  changing  the  weights  given  to  the  polarization  brightness  temperatures  in  the  85  GHz 
channel.  For  light  precipitation,  the  relationship  between  D  and  PCT  was  latitude 
dependent.  No  clear  latitudinal  dependence  was  found  for  heavy  precipitation.  Liu  and 
Curry  conclude  that  the  value  of  the  D-PCT  slope  can  be  used  to  help  categorize 
precipitation  types,  which  may  be  useful  in  determining  a  specific  algorithm  best  used  for 
precipitation  type. 

Conner  and  Petty  (1998)  compared  SSM/I  rain  rate  retrieval  methods  over  the 
Continental  United  States.  The  researchers  compared  three  experimental  rain-rate 
algorithms  (sorting  by  land  type)  with  two  existing  SSM/I  rain  rate  algorithms,  using 
hourly  rain  gauge  reports  and  10-cm  radar  data  for  ground  truth.  Results  of  the  research 
were  inconclusive:  the  five  algorithms  yielded  similar  results,  with  no  algorithm  showing 
itself  to  be  superior. 

Kidd  et  al.  (1998)  reviewed  the  performance  of  rainfall  rate  retrieval  algorithms 
established  by  statistical  relationships  and  empirical  calibration.  The  advantages  of 
statistical-empirical algorithms were found to outweigh the disadvantages, though
developers  are  aware  of  the  limitations  of  such  algorithms.  Physically-based  algorithms, 
they  conclude,  are  not  likely  to  improve  until  the  intricacies  of  the  physical  relationships 
are  better  understood. 

Ferraro  et  al.  (1998)  set  forth  a  methodology  to  screen  land-related  effects  from 
SSM/I  precipitation  retrieval  algorithms.  They  found  the  complicated  interaction  of 
earth-emitted  microwave  radiation  with  various  land  types  and  atmospheric  variations 
made  the  development  of  a  single  global  screening  methodology  very  difficult.  For  this 
reason,  they  recommend  development  of  screening  methodologies  on  a  regional  basis. 

Wentz  and  Spencer  (1998)  developed  a  physically-based  algorithm  for  retrieving 
rain  rates  from  SSM/I  measurements  over  oceans.  The  algorithm  uses  a  beamfilling 
correction  based  upon  liquid  water  absorption  coefficients  at  37  GHz  and  19  GHz  to 
correct  the  underestimation  of  rainfall  rates  by  other  physically-based  techniques.  The 
algorithm  simultaneously  calculates  wind  speed,  columnar  water  vapor  and  liquid  water 
content,  rain  rate,  and  effective  radiating  temperatures  for  upwelling  radiation.  The  root 
mean  square  difference  between  retrieved  water  vapor  value  and  radiosonde-measured 
value  was  5  mm.  The  algorithm  was  still  found  to  underestimate  rainfall  rates  in  the 
tropics. 

Documentation  on  the  SSM/I  sensor  by  Hollinger  (1983)  contained  a  summary 
description  of  the  SSM/I  instrument,  as  well  as  descriptions  of  the  geophysical  models, 
the  interaction  model,  the  retrieval  technique,  and  the  climatology  used  for  the  SSM/I 
environmental  retrieval  algorithm.  McFarland  (1991),  who  investigated  the  brightness 
temperature  behavior  and  polarization  differences  among  various  land  types,  conducted 
the research that led to the CV algorithm. McFarland used SSM/I brightness
temperatures  obtained  for  specific  locations  on  earth  of  a  known  land  type.  From  these 
data,  he  derived  a  set  of  logical  if-and-then  statements  (including  the  sample  statements 
shown  earlier  in  this  chapter)  for  determination  of  14  distinct  land  types  (9  of  which  are 
still  used  by  AFWA  in  the  CV  algorithm)  from  the  7  channel  brightness  temperature 
measurements.  McFarland  next  used  multiple  linear  regression  to  calculate  algorithms  to 
correlate  SSM/I  measurements  and  land  surface  temperatures  for  each  of  four  general 
land  type  categories.  For  this  portion  of  the  research,  which  spanned  four  days  in  August 
1987,  SSM/I  readings  from  the  F8  satellite  over  the  US  Western  Desert  and  Central 
plains  were  matched  with  high/low  temperature  observations  from  the  federal 
climatological  network. 

Other  research  found  of  relevance  to  this  thesis  was  a  paper  investigating  land 
surface  temperature  determination  from  satellites  (Prata,  1994),  in  which  the  Advanced 
Very  High  Resolution  Radiometer  (AVHRR)  and  the  Along  Track  Scanning  Radiometer 
(ATSR)  were  used  to  compare  IR  emissions  with  surface  temperatures  over  Continental 
Australia.  The  work  concluded  that  an  algorithm  that  took  into  account  climatological 
temperature  and  water  vapor  profiles  could  yield  accurate  temperature  measurements. 

Schmugge  and  Schmidt  (1998)  also  used  an  AVHRR  sensor  aboard  the  NOAA-9 
satellite  to  measure  the  surface  temperature  during  the  First  ISLSCP  (International 
Satellite  Land  Surface  Climatology  Project)  Field  Experiment  (FIFE),  conducted  over 
grassland  terrain  in  central  Kansas  in  1987.  The  AVHRR-derived  values  were  corrected 
for  atmospheric  effects  and  compared  to  broadband  temperature  readings  at  10  sites  and 
to  the  thermal  channel  of  an  NS001  sensor  aboard  a  C-130  aircraft.  The  AVHRR  values 
were found to be 5° to 6°C warmer than the average of the ground measurements. The
researchers  attributed  the  difference  to  the  location  of  ground  measurements  being 
skewed  toward  well-vegetated  surfaces. 

Betts  and  Ball  (1998)  performed  a  study  confirming  earlier  research  regarding  the 
diurnal  and  seasonal  variation  of  surface  albedo.  One  of  the  findings  was  that  soil  heat 
flux  is  reduced  at  night  when  soil  is  drier. 

Summary 

Measurement  of  microwave  emissions  to  determine  land  surface  temperature  is 
feasible  but  fraught  with  complications.  Use  of  the  Rayleigh-Jeans  approximation  can 
simplify  the  problem,  but  one  must  remain  mindful  of  the  extent  to  which  the  earth  does 
not  conform  to  required  assumptions.  The  radiative  transfer  equation  relates  surface 
temperature  to  various  brightness  temperatures,  but  the  equation  cannot  be  solved  easily 
due  to  its  inverse  nature.  Given  the  large  amount  of  data  present,  statistical  methods 
such  as  multiple  linear  regression  can  be  used  to  estimate  coefficients  for  a  solution. 

The  CV  algorithm  sorts  data  by  land  type,  which  can  make  regression  more 
accurate.  To  compare  data  from  two  different  satellites,  an  indicator  variable  can  be 
generated  and  regressed  to  determine  if  the  regression  functions  are  identical.  Sources  of 
error  are  many,  but  with  luck  can  be  quantified  via  regression.  Finally,  very  little  recent 
research  has  been  done  on  using  SSM/I  to  calculate  land  surface  temperatures,  other  than 
that performed at AFIT by Harris and Comoglio.

III. Methodology


Chapter  Overview 

This  chapter  explains  the  processing  of  the  data  into  meaningful  results.  Satellite 
brightness  temperature  channels  were  consolidated  into  sets  by  time  and  location  and 
then  compared  to  surface  temperature  observations  taken  nearby  at  comparable  times. 
These  matched  sets  were  then  exported  into  files,  consolidated  by  season,  and  sorted  by 
CV  to  determine  land  types.  Finally,  data  was  exported  to  a  PC  and  processed  using 
statistical  software. 

The  following  sections  will  first  touch  upon  the  work  done  by  others  at  AFIT 
(Harris,  1998;  and  Comoglio,  1997)  and  explain  how  the  data  they  acquired  was  modified 
to  fit  the  purposes  of  this  research.  There  will  follow  a  detailed  discussion  of  the 
matching  and  sorting  process  performed.  At  the  end  of  the  chapter,  there  will  be  a 
discussion  of  the  regression  techniques,  both  quantitative  and  qualitative,  used  to 
determine  the  results  of  the  research. 

Past  Work  at  AFIT 

While  the  research  performed  in  this  thesis  is  not  directly  related  to  the  work  of 
Harris  (1998)  and  Comoglio  (1997),  both  the  data  they  gathered  and  the  FORTRAN  code 
they  wrote  were  quite  useful  in  its  successful  completion.  Comoglio  started  with 
preliminary  research  comparing  the  effectiveness  of  the  CV  algorithm  with  the  TMPSMI 
(TS)  algorithm.  Due  to  problems  decoding  the  data,  he  was  unable  to  make  a  complete 
comparison of the two algorithms. Taking up where Comoglio left off, Harris gathered
additional data and completed the study, in which he concluded the CV algorithm was
more  accurate.  Among  the  data  gathered  by  Harris  and  Comoglio  were  sets  of  SSM/I 
readings  decoded  into  brightness  temperatures  by  AFWA  (known  as  Sensor  Data 
Records  or  SDR)  and  surface  observations  (acquired  from  the  Air  Force  Combat 
Climatology  Center  [AFCCC])  for  Bosnia,  the  continental  United  States,  Saudi  Arabia, 
and  the  Korean  Peninsula.  SDR  readings  from  the  F13  DMSP  satellite  were  recorded 
over  the  four  areas  during  August  1996,  October  1996,  and  January-February  1997. 
Harris also acquired SDR from both the F13 and F14 satellites for the period of late
August-September 1997, but he did not have the time to decode this data for his study.
The  data  used  in  this  study  is  summarized  in  Table  1  below. 

Harris  and  Comoglio  also  wrote  a  number  of  useful  programs  and  algorithms  to 
analyze  the  data.  While  I  was  only  able  to  use  one  of  their  programs  directly,  I  was  able 
to  adapt  a  number  of  their  algorithms  to  accomplish  important  tasks  in  my  efforts.  Thus, 
it  can  be  said  that  my  work  was  very  much  a  team  effort  in  concert  with  the  efforts  of 
Harris and Comoglio.

Table 1: Data Matches Used in Research

Satellite(s)   Date         # of Matches
F10/F13        16-Oct-96     90
F10/F13        17-Oct-96     68
F10/F13        18-Oct-96    229
F10/F13        21-Oct-96    223
F10/F13        22-Oct-96     88
F10/F13        25-Oct-96    137
F10/F13        29-Oct-96    233
F10/F13        31-Oct-96    230
F10/F13        04-Nov-96    270
F10/F13        05-Nov-96    208
F14            22-Aug-97    136
F14            27-Aug-97    211
F14            28-Aug-97    227
F14            02-Sep-97    195
F14            04-Sep-97    243
F14            16-Sep-97    239
F14            17-Sep-97    192
F14            18-Sep-97     74
F14            19-Sep-97    134
F14            22-Sep-97    107
F14            23-Sep-97    137
F14            25-Sep-97    165
F14            26-Sep-97    203
F14            30-Sep-97    318


Data  Description 

The  SSM/I  SDR  data  was  recorded  on  8mm  tapes  by  AFWA.  The  files  were 
organized  such  that  every  day  of  observation  was  in  a  separate  directory.  Inside  each 
directory  were  nine  binary  files:  one  for  each  of  the  seven  channels,  one  for  the  date  time 
group  of  the  satellite  pass,  and  one  for  the  identity  of  the  satellite  (See  Table  2).  Each  file 
contained readings for each of the 64 x 64 (4096) grid cells within each of the 64 polar
stereographic grid boxes which cover the entire Northern Hemisphere (See Figure 5).

Table 2: Satellite Data File Names

File Type                          File Name
Grid Date Time Stamp               RNXMI1_00MITT
19 GHz, Horizontal Polarization    RNXMI1_00MIH1
19 GHz, Vertical Polarization      RNXMI1_00MIV1
22 GHz, Vertical Polarization      RNXMI1_00MIV2
37 GHz, Horizontal Polarization    RNXMI1_00MIH3
37 GHz, Vertical Polarization      RNXMI1_00MIV3
85 GHz, Horizontal Polarization    RNXMI1_00MIH8
85 GHz, Vertical Polarization      RNXMI1_00MIV8
Satellite Identifier               RNXMI1_00MIID


Figure 5: Polar Stereographic Northern Hemisphere Grid

The surface temperature data from AFCCC was sent in compressed format on 3 1/2
inch disks. Each line consisted of a series of numbers, representing the World
Meteorological Organization’s (WMO) station number; the year, month, and day of the
observation; the hour and minute of the observation in Universal time; the surface
temperature in Kelvin; and the longitude and latitude of the station (See Table 3).


Table 3: Example of Data Format

STATION  YEAR  MO  DAY  HR    LAT     LON      TEMPK
470050   1996  10  01   0300  41.817  128.317  289.66
470050   1996  10  01   0600  41.817  128.317  291.66
470050   1996  10  01   0900  41.817  128.317  286.36
470050   1996  10  01   1200  41.817  128.317  280.46
470050   1996  10  01   1500  41.817  128.317  279.36


Decoding,  Combining,  Matching,  and  Sorting  the  Data 

The  four  phases  of  transforming  disparate  data  files  into  a  group  of  usable 
matched  data  sets  are:  decoding,  combining,  matching,  and  sorting  (See  Figure  6  for  a 
flow  chart  which  outlines  the  four  step  process).  For  F10/F13  data  already  analyzed  by 
Harris  (1998),  the  first  phase  was  already  accomplished;  the  August  1996,  October  1996, 
and  January/February  1997  SSM/I  SDR  files  were  already  decoded  into  ASCII  files  from 
binary  files.  The  following  few  paragraphs  outline  the  data  analyzing  process,  first  for 
the three seasons outlined above, and then for the late August/September 1997 data not
previously analyzed.

Figure 6: Methodology Flow Chart (chart not reproduced; legible program labels: bindecode.f, landsort.f)


In  the  data  already  analyzed  by  Harris  (1998),  some  of  the  groundwork  had 
already  been  done.  However,  Harris  compared  data  pairs  of  CV  or  TS  algorithm 
temperature  with  surface  temperature,  whereas  this  research  matched  data  sets  of  the 
seven  channel  brightness  temperatures  with  corresponding  surface  temperatures.  Further, 
while  Harris  implicitly  sorted  by  land  type  when  he  used  the  CV  and  TS  source  code  to 
calculate  the  algorithm  temperatures,  he  had  no  need  to  sort  the  output  by  land  type  to 
accomplish his goals. Thus, while this research used the same raw data, that data needed to be
manipulated in a substantially different manner than in previous research.

Nevertheless, having the F10/F13 SSM/I SDR data already decoded gave the
research a considerable boost. For the August 1996, October 1996, and January/February
1997 data, the binary data had already been decoded into ASCII. Thus, for this data,
phase one was already complete. For phase two, a short FORTRAN program was written
to combine the seven separate channel temperature files for each day into one file per
day. The record format of the new daily file was: seven channel brightness temperatures,
followed by the three grid coordinates, referred to hereafter as the “i”, “j”, and “k”
coordinates.
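
As a rough illustration of phase two, the following Python sketch merges per-channel readings into one record per grid cell, using the record order described above. It is not the FORTRAN program used in this research: the in-memory dictionary layout and the names CHANNELS and combine_channels are assumptions made purely for the example.

```python
# Hypothetical sketch of phase two (not the original FORTRAN program).
# ASSUMPTION: each channel file has already been read into a dict that maps
# a grid cell (i, j, k) to a brightness temperature in Kelvin.

CHANNELS = ["H19", "V19", "V22", "H37", "V37", "H85", "V85"]


def combine_channels(channel_data):
    """Return one record per grid cell: seven temperatures, then i, j, k.

    Only grid cells present in all seven channels are kept.
    """
    common = set(channel_data[CHANNELS[0]])
    for name in CHANNELS[1:]:
        common &= set(channel_data[name])

    records = []
    for (i, j, k) in sorted(common):
        temps = [channel_data[name][(i, j, k)] for name in CHANNELS]
        records.append(tuple(temps) + (i, j, k))
    return records
```

Dropping cells that are missing from any channel is a design choice of this sketch; the text does not say how incomplete cells were handled.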

The third phase was to match the sets of channel temperatures to a surface
observation nearby in both space and time. For determination of spatial proximity, the
earlier criterion of Harris (1998) and Comoglio (1997) was used: the surface station and
SSM/I measurement coordinates must be no more than 1 grid point apart (see Figure 7).
For ease of calculation, so-called “superneph” coordinates were used; that is, the “i,” “j,”
and “k” coordinates converted into 512 x 512 “SI” and “SJ” coordinates. For the surface
temperature observations, a FORTRAN program written by Harris and Comoglio was
used to translate the latitudes and longitudes into “superneph” coordinates. After the
surface observations were properly formatted, a new FORTRAN program performed the
matching with corresponding brightness temperatures by space and time. While much of
the code was original, many algorithms were adapted from a previously existing program
(Harris, 1998). The matching program first looked at the Julian day and time of the
observation at each grid point. If the Julian day matched the day being examined (i.e. the
program automatically threw out any data from passes on previous days), it then opened
the observation file for the appropriate hour’s observations. The program then converted
the brightness temperature i, j, and k coordinates to SI and SJ grid coordinates, and
compared them to the SI and SJ coordinates of each observation. If the coordinate sets
were within one grid point of one another, the program saved the matched data set to a
match file. This process continued until all grid points had been examined in this way.
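
The matching logic described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the FORTRAN matching program: the box numbering (boxes k = 1 to 64 laid out row by row in an 8 x 8 arrangement, with i and j counting 1 to 64 inside a box), the rule that "appropriate hour" means equal hour values, and all function and key names are hypothetical.

```python
# Hypothetical sketch of the phase-three match (not the original FORTRAN code).
# ASSUMPTION: boxes k = 1..64 are numbered row by row across an 8 x 8 layout,
# and i, j = 1..64 count grid cells inside a box. The real numbering may differ.

BOXES_PER_SIDE = 8
CELLS_PER_BOX = 64


def to_superneph(i, j, k):
    """Convert box-relative (i, j) in box k to 512 x 512 (SI, SJ) coordinates."""
    box_row, box_col = divmod(k - 1, BOXES_PER_SIDE)
    si = box_col * CELLS_PER_BOX + i
    sj = box_row * CELLS_PER_BOX + j
    return si, sj


def is_match(sat_sij, obs_sij, max_gap=1):
    """True when the points are no more than max_gap grid points apart in SI and SJ."""
    return (abs(sat_sij[0] - obs_sij[0]) <= max_gap
            and abs(sat_sij[1] - obs_sij[1]) <= max_gap)


def match_day(readings, observations, julian_day):
    """Pair each same-day satellite reading with nearby same-hour observations.

    readings:     dicts with keys "day", "hour", "i", "j", "k", "temps"
    observations: dicts with keys "day", "hour", "si", "sj", "tempk"
    """
    matches = []
    for r in readings:
        if r["day"] != julian_day:  # discard passes from other days
            continue
        sat = to_superneph(r["i"], r["j"], r["k"])
        for o in observations:
            same_time = o["day"] == julian_day and o["hour"] == r["hour"]
            if same_time and is_match(sat, (o["si"], o["sj"])):
                matches.append((r["temps"], o["tempk"]))
    return matches
```

Treating diagonal neighbors as "one grid point apart" is also an assumption of this sketch.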


Figure 7: “Superneph” grid over Malmstrom AFB, MT, ICAO “GFA”

(From  Harris,  1998) 

Once  all  data  had  been  matched,  phase  four  sorted  the  matched  data  by  CV  land 
type.  This  was  done  by  a  FORTRAN  program.  While  much  of  the  code  was  original,  the 
CV  land  sorting  algorithms  were  taken  from  the  existing  calval  FORTRAN  program  (see 
Harris,  1998).  This  program  created  nine  files:  one  for  each  of  the  eight  CV  land  types, 
and  a  ninth  for  those  observations  not  fitting  any  of  the  land  types.  The  specific  land  type 
algorithms  are  listed  in  Appendix  C.  After  the  program  was  run  for  each  day,  a  UNIX 
text editor was used to combine the daily data into “seasonal” files: Summer 1996, Fall
1996, and Winter 1997. (Note: Harris’s original data was further sorted by “CONUS”
and “overseas” region. While some analysis was done on the regions separately, this
researcher eventually decided not to sort the data by region.) After the data was
manipulated as described above, the data sets were transferred to a PC, where the files
were imported into the statistical program “S-PLUS” for analysis.

After  the  original  data  was  analyzed,  the  research  turned  to  the  Fall  1997  data  not 
analyzed  before.  Unlike  the  data  described  earlier,  this  data  was  still  in  binary  form. 
Therefore,  the  transformation  process  had  to  start  with  phase  one. 

Because of lessons learned from the original run-through, the process for
analyzing the F14 data was streamlined. First, phases one and two were accomplished
with a single FORTRAN program. This program opened all appropriate files for a given
day, then wrote the i, j, and k coordinates; the day and time of observation; the satellite ID
number; and the seven channel temperatures for each grid point. To
save storage space, only the grid boxes over CONUS, Saudi Arabia, Bosnia, and Korea
were decoded and saved in ASCII format.

There  were  only  slight  differences  in  the  process  for  phases  three  and  four  for  the 
F14  data.  Synoptic  temperature  observations  were  again  decoded  using  an  existing 
FORTRAN  program  written  by  Comoglio  (1997)  and  Harris  (1998),  which  this 
researcher  modified  slightly. 

The  matching  process  was  again  performed  by  a  FORTRAN  program,  which  was 
only slightly modified for the F14 data. Specifically, the match program was rewritten to
identify the satellite ID of each reading. If the satellite ID number did not equal 48, the
ID number assigned to the F14 satellite (Coxwell, private communication, 1998), the
matched data set was not saved to the appropriate file. This was done to ensure the new
data would indeed consist entirely of brightness temperature data from the F14 satellite.
In another lesson learned from the first analysis, the daily files were merged into a single
data file before the data was sorted into the separate land type files.

As  was  done  with  the  earlier  data  sets,  a  FORTRAN  program  was  used  to  sort  the 
matched  data  sets  into  land  type  categories.  An  additional  parameter  was  added  to  the 
end  of  each  data  set:  an  integer  “1”  to  identify  the  new  data  as  F14  data.  As  was  done 
with  the  earlier  data,  the  files  were  then  transferred  to  a  PC  for  analysis.  Separate  files 
were  created,  combining  the  F10/F13  data  from  Fall  1996  and  the  F14  data  for  each  land 
type,  and  adding  a  “0”  to  the  F10/F13  data  in  a  new  column  to  match  the  “1”  for  the  F14 
data.  The  data  was  then  ready  for  both  quantitative  and  qualitative  regression. 
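
The satellite filtering and indicator tagging described above might look like the following Python sketch. Only the F14 ID number (48) and the 0/1 indicator convention come from the text; the record layout and the function names are assumptions made for illustration.

```python
# Hypothetical sketch of the satellite-ID filter and indicator column.

F14_ID = 48  # satellite ID number assigned to F14, as stated in the text


def keep_f14(records):
    """Keep only matched records whose satellite ID is the F14 ID.

    ASSUMPTION: each record is a dict with a "sat_id" key.
    """
    return [r for r in records if r["sat_id"] == F14_ID]


def add_indicator(rows, value):
    """Append the indicator column (0 for F10/F13, 1 for F14) to each row."""
    return [tuple(row) + (value,) for row in rows]


def combine_for_regression(f10_f13_rows, f14_rows):
    """Stack both data sets for one land type into a single table."""
    return add_indicator(f10_f13_rows, 0) + add_indicator(f14_rows, 1)
```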

Statistics  Used 

This  research  will  use  the  statistical  methods  of  multiple  linear  regression  (MLR) 
and  stepwise  linear  regression.  These  methods  are  outlined  in  the  “General  Linear 
Model”  and  “Stepwise  Linear  Regression”  sections  of  Chapter  II.  The  assumed  “true” 
regression  equation  will  be  in  the  form  of  equation  (7),  specifically  as  follows: 

\[ T_i = \beta_0 + \beta_{H19} H19_i + \beta_{V19} V19_i + \cdots + \beta_{V85} V85_i \; (+ \beta_{indicator} Indicator_i) + \varepsilon_i \tag{9} \]

where $T_i$ is the synoptic temperature observation;

$H19_i$, $V19_i$, etc. represent the brightness temperature values;

$\beta_{indicator}$ and $Indicator_i$ represent the qualitative regression coefficient and the
indicator variable, respectively; and

$\varepsilon_i$ is the residual, assumed to be taken from a normal distribution of mean 0 and
variance $\sigma^2$.

In equation (9) above, the parenthetical term is included in the qualitative regression to
determine if there is a statistically significant difference between the F10/F13 data and
the F14 data.

To  analyze  the  quality  of  results  of  the  data  regression,  five  statistics  were  used: 
the  root  mean  square  error  (RMSE,  also  known  as  the  residual  standard  error),  the  root 
mean  square  predictor  error  (RMSPR),  the  multiple  R  squared  statistic,  the  F  statistic,  and 
the  t  statistic.  The  RMSE  was  used  to  evaluate  the  variance  of  the  fitted  temperature 
values  with  respect  to  the  synoptic  temperature  measurements.  The  RMSPR  was  used  to 
cross-validate  regression  on  another  data  set.  The  multiple  R  squared  was  used  to 
evaluate  the  amount  of  variance  explained  by  the  multiple  regression,  i.e.  the  “goodness 
of  fit”  of  the  regression.  The  F  statistic  was  used  to  test  the  validity  of  the  Null 
Hypothesis  that  all  regression  coefficients  were  equal  to  zero.  Finally,  the  t  statistic  was 
used  to  decide  upon  the  validity  of  the  Null  Hypothesis  that  the  true  regression  equations 
of  the  F10/F13  data  set  and  the  F14  data  set  were  identical.  The  latter  determination  is 
important  because  the  primary  goal  of  this  research  is  to  determine  if  CV  brightness 
temperature  coefficients  need  to  be  calculated  separately  for  each  new  DMSP  satellite 
launched. 

A  common  accuracy  measure  in  the  environmental  sciences  is  the  MSE,  or  mean 
squared  error,  which  averages  the  individual  squared  differences  between  a  forecasted 
value  and  an  observed  value  (Wilks,  1995).  In  the  case  of  this  research,  we  can  think  of 
the fitted temperature value of the regression as the “forecast.” Thus, if we consider y the
fitted value and o the value of the synoptic observation, the MSE for M such data pairs is

\[ MSE = \frac{1}{M} \sum_{m=1}^{M} (y_m - o_m)^2 \tag{10} \]

The RMSE, the statistic used in this research, is simply the square root of the MSE
(Wilks, 1995).

The  mean  squared  predictor  error  is  a  means  of  measuring  the  actual  predictive 
capability  of  the  selected  regression  model  by  testing  its  effectiveness  on  another  data  set 
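(Neter et al., 1989). The MSPR for the n* cases in the new (or validation) data set is

\[ MSPR = \frac{1}{n^*} \sum_{i=1}^{n^*} (y_i - o_i)^2 \]

where $y_i$ is the value predicted for validation case i by the model fitted to the training
data and $o_i$ is the corresponding observation. The RMSPR is the square root of the MSPR.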

If  the  MSPR  is  fairly  close  to  the  MSE  based  on  the  regression  fit  to  the  original  (or 
training)  data  set,  then  the  error  mean  square  MSE  for  the  selected  regression  model  is 
not  seriously  biased  and  gives  an  approximate  indication  of  the  predictive  ability  of  the 
model.  If  the  MSPR  is  much  larger  than  the  MSE,  one  should  rely  on  MSPR  to 
determine  how  well  the  selected  regression  model  will  predict  in  the  future.  (Neter  et  al., 
1989) 
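
A minimal Python sketch of the two error measures follows. The function names and the idea of passing the fitted model as a callable are assumptions made for the example, not part of the S-PLUS analysis.

```python
import math


def rmse(fitted, observed):
    """Root mean square error between fitted values and observations."""
    if len(fitted) != len(observed) or not fitted:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((y - o) ** 2 for y, o in zip(fitted, observed)) / len(fitted)
    return math.sqrt(mse)


def rmspr(predict, validation_inputs, validation_observed):
    """Root mean square predictor error on a validation data set.

    predict is the model fitted to the TRAINING data; it is applied, unchanged,
    to inputs from the validation set.
    """
    predictions = [predict(x) for x in validation_inputs]
    return rmse(predictions, validation_observed)
```

The only difference between the two statistics is which data the model is evaluated on: the RMSE uses the data the regression was fitted to, while the RMSPR uses data the regression has not seen.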


The  R  squared  statistic,  also  called  the  coefficient  of  multiple  determination,  is  a 
number  between  0  and  1  which  describes  the  proportion  of  total  variation  “explained”  by 
the  multiple  regression  model  (Devore,  1995),  or  the  “goodness  of  fit”  of  the  regression. 
In  other  words,  it  is  the  degree  to  which  the  variance  of  the  data  is  explained  by  the 
regression equation. If every data point fell exactly on the regression line, the R-squared
value would be 1. If, on the other hand, the sum of the squares of the distances from the
regression line were not significantly smaller than the sum of the squares of the distances
from the overall mean, the R-squared value would be very low. If an R-squared
is small, an analyst will usually want to search for an alternative model that can
more effectively explain data variation (Devore, 1995). While there is no agreed-upon
“cutoff” value for R-squared, I consider an R-squared of .6 the minimum value for
considering the regression a “good fit.”

R-squared is calculated numerically as

\[ R^2 = 1 - SSE/SST \tag{11} \]

where

\[ SSE = \sum_{m=1}^{M} (o_m - y_m)^2 \tag{12} \]

and

\[ SST = \sum_{m=1}^{M} (o_m - \bar{o})^2 \tag{13} \]

In equation (12), each predicted value ($y_m$) is subtracted from the corresponding
observed value ($o_m$), whereas in equation (13) the overall mean of the observed values
($\bar{o}$) is subtracted from each observed value (Devore, 1995). In other words, SSE is the
sum of squared deviations about the regression line, while SST is the sum of squared
deviations about the mean value of the entire data set. MLR will minimize SSE, thereby
maximizing R-squared.
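
As a small worked illustration, R-squared can be computed from fitted and observed values as below. The sketch takes SST as the sum of squared deviations of the observations about their own mean, which is the standard definition; the function name is an assumption.

```python
def r_squared(fitted, observed):
    """Coefficient of multiple determination, R^2 = 1 - SSE / SST."""
    mean_obs = sum(observed) / len(observed)
    # SSE: squared deviations of the observations about the regression fit
    sse = sum((o - y) ** 2 for y, o in zip(fitted, observed))
    # SST: squared deviations of the observations about their overall mean
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst
```

A fit that reproduces every observation gives SSE = 0 and R-squared = 1, while a fit that does no better than the overall mean gives SSE = SST and R-squared = 0.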

The  F  statistic  is  used  in  a  model  utility  test  which  determines  if  there  is  a  useful 
relationship  between  y  and  the  regressed  predictors  (Devore,  1995).  For  this  research,  F 
will be used to decide whether to reject the Null Hypothesis

\[ H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0 \tag{14} \]

in favor of the Alternative Hypothesis

\[ H_a: \text{not all true regression coefficients} = 0 \tag{15} \]

F* is calculated for a multiple regression with n data points and k predictors as

\[ F^* = \frac{R^2/k}{(1-R^2)/[n-(k+1)]} \tag{16} \]

with the rejection level for the Null Hypothesis (in the case of this research, a .01
probability of Type I Error, or the probability of rejecting the Null when the Null is true)
being

\[ F^* > F_{.01,\,k,\,n-(k+1)} = F_{crit} \tag{17} \]

The critical value $F_{.01,k,n-(k+1)}$ can be determined either with a mathematical program such
as MATHCAD, or through use of a statistical table. Figure 8 shows the critical value of
F for 7 and 92 degrees of freedom, corresponding to MLR on 100 data points and seven
predictor variables.


Figure 8: F distribution with 7 and 92 degrees of freedom.
Critical value Fcrit is shown.

Since  the  numbers  k  and  n  will  vary  among  the  various  data  sets,  I  will  not 
explicitly  calculate  the  critical  F  value  for  each  data  set.  Rather,  I  will  measure  the 
probability, assuming the Null is true, of obtaining an F statistic at least as large as F*.
This value, or “P-value”, is
the area under the curve of the Null’s F distribution greater than the statistic F*. This can
be expressed numerically as

\[ P = \int_{F^*}^{\infty} f(x)\,dx \tag{18} \]

where f(x) is the density of the F distribution with k and n-(k+1) degrees of freedom,
and can be expressed graphically as shown in Figure 9.

Figure 9: F distribution with 7 and 92 degrees of freedom.
P-value is the area under the curve to the right of F*.


A P-value smaller than the desired probability of committing a Type I error means one
can reject the Null. Rejecting the F-statistic Null hypothesis in this research means there
is some relationship between at least one of the brightness temperatures and the synoptic
temperature observation. Failing to reject the Null means the data do not show a
statistically significant relationship between brightness temperature and synoptic temperature observation.
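
The calculation of F* and its P-value can be sketched in Python using only the standard library. This is an illustration, not the S-PLUS or MATHCAD procedure used in the research: the tail area is obtained here by Simpson's rule after a change of variable, and the function names, the step count, and the restriction to more than two denominator degrees of freedom are all assumptions of the sketch.

```python
import math


def f_statistic(r2, n, k):
    """F* from equation (16) for n data points and k predictors."""
    return (r2 / k) / ((1.0 - r2) / (n - (k + 1)))


def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom (x > 0)."""
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_f = (0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                    - (d1 + d2) * math.log(d1 * x + d2))
             - math.log(x) - log_beta)
    return math.exp(log_f)


def f_p_value(f_star, d1, d2, steps=2000):
    """Area under the F density to the right of f_star (f_star > 0, d2 > 2).

    The substitution x = f_star + t / (1 - t) maps the infinite tail onto
    0 <= t < 1, where composite Simpson's rule is applied (steps must be even).
    """
    def g(t):
        if t >= 1.0:
            return 0.0  # the integrand vanishes as x -> infinity when d2 > 2
        x = f_star + t / (1.0 - t)
        return f_pdf(x, d1, d2) / (1.0 - t) ** 2

    h = 1.0 / steps
    total = g(0.0) + g(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0
```

For a regression with n data points and k predictors, the P-value would be obtained as f_p_value(f_statistic(r2, n, k), k, n - (k + 1)) and compared with the .01 rejection level.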

Finally,  the  t  statistic  examines  the  null  hypothesis  that  the  true  regression 
equation  of  one  data  set  is  identical  to  that  of  another  data  set  (Wilks,  1995).  In  the  case 
of  this  research,  it  will  test  the  Null  Hypothesis 

\[ H_0: \beta_{indicator} = 0 \tag{19} \]

against the Alternative Hypothesis

\[ H_a: \beta_{indicator} \neq 0 \tag{20} \]



In  other  words,  rejecting  the  t-statistic  Null  Hypothesis  means  there  is  a  statistically 
significant  difference  between  the  true  regression  equations  of  the  F10/F13  data  and  the 
F14  data.  Failing  to  reject  the  Null  indicates  there  is  no  such  difference  between  the  true 
regression  equations  of  the  F10/F13  data  and  the  F14  data. 

The software program S-PLUS tests these hypotheses using the t statistic, t*. The
criterion for rejecting the Null for a rejection level of .01 is

\[ |t^*| > t_{.005,\,n-(k+1)} \]

The critical value $t_{.005,n-(k+1)}$ could be found in a statistical table if necessary, but
a computer makes this calculation much more quickly. For example, Figure 10 shows the
critical values for a t statistic where n = 100 and k = 7 (i.e. there are 92 degrees of
freedom).

Figure 10: critical values |t*| > t.005,92

As was the case with the F statistic, the number of degrees of freedom will change with
each data set. Thus, instead of carrying different critical values through the results, I will
instead show the “P-value.” The only difference between the P-value for the F statistic
and the P-value for the t statistic is that, because we are performing a 2-sided test of the
Null, the .01 rejection level is split evenly between the two tails of the t distribution: the
area in a single tail is compared with .005, which is equivalent to comparing the two-sided
P-value with .01.
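
To make the qualitative regression concrete, the Python sketch below fits a regression of the form of equation (9) by ordinary least squares and returns the t statistic of the indicator coefficient. It is a pure standard-library illustration under assumptions, not the S-PLUS routine used in this research; the function names and the normal-equations approach are choices made for the example.

```python
import math


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def indicator_t_test(x_rows, indicator, temps):
    """Fit T = b0 + sum(b_j * x_j) + b_ind * indicator by least squares.

    Returns (coefficients, t_star); t_star is the t statistic of b_ind, the
    last coefficient. Compare |t_star| with the critical t value, or convert
    it to a P-value, to test the Null that b_ind = 0.
    """
    design = [[1.0] + list(x) + [float(d)] for x, d in zip(x_rows, indicator)]
    n, p = len(design), len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)]
           for a in range(p)]
    xty = [sum(row[a] * t for row, t in zip(design, temps)) for a in range(p)]
    inv = invert(xtx)
    coef = [sum(inv[a][b] * xty[b] for b in range(p)) for a in range(p)]
    sse = sum((t - sum(c * v for c, v in zip(coef, row))) ** 2
              for row, t in zip(design, temps))
    mse = sse / (n - p)                     # residual variance estimate
    std_err = math.sqrt(mse * inv[-1][-1])  # standard error of b_ind
    return coef, coef[-1] / std_err
```

Here x_rows would hold the brightness temperatures retained by the stepwise regression for one land type, and indicator would hold 0 for F10/F13 rows and 1 for F14 rows.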

Summary 

A  four  phase  process  was  used  to  gather  matched  data  sets,  sort  them  by  season 
and  land  type,  and  combine  the  data  used  by  Harris  (1998)  and  Comoglio  (1997)  from 
Fall 1996 with new data from the F14 satellite from Fall 1997. Five statistics were used
to analyze the data: RMSE, RMSPR, R squared, the F statistic, and the t statistic. With all of the
groundwork now laid, analysis of the results can begin.

IV. Results Analysis


Chapter  Overview 

The results of the research were mixed. This chapter first outlines the statistics calculated to
cross-validate data from different satellites and then offers a possible explanation for the
results. The data was then split and cross-validated, and the
cross-validated root mean squared predictor error values were compared with Harris’s root mean
squared error values obtained using the old CV coefficients.

Data  Sets 

The  data  was  sorted  by  season  and  land  type.  The  bulk  of  data  analysis  was  done 
using  F10/F13  data  from  October  1996  and  F14  data  from  September  1997.  Since  there 
was  no  F14  data  to  coincide  seasonally  with  Jan/Feb  1997  data,  and  since  some  of  the 
CONUS  data  from  August  1996  was  unreadable,  these  data  sets  were  not  used  in  the 
main research. However, some non-validated multiple linear regression was performed
on these data sets; the results are in Appendix A.

Qualitative  Regression  Results 

After  the  indicator  variable  was  added  to  the  data  as  described  in  Chapter  III, 
stepwise  linear  regressions  were  performed  upon  the  data  (not  including  the  indicator 
variable)  to  determine  which  of  the  brightness  temperature  channels  were  statistically 
significant  in  calculating  surface  temperature.  After  stepwise  linear  regression  was  used 
to  eliminate  the  channel  temperatures  that  did  not  significantly  contribute  to  the 
regression, a second multiple linear regression was performed, including the indicator
variable, to determine if satellite identity was a significant factor in the regression.
The results of this qualitative regression follow:


Table 4. Results of Qualitative Regression
Null Hypothesis: Satellite Identity is not significant in regression
Critical P-Value for Rejection of Null: 0.01

Land Type                  # of Data Sets   T Statistic   P Value   Null Rejected
Dry, Arable Soil            681             -2.34         0.0196    No
Moist Soil                  346              0.3688       0.7125    No
Semidesert                   48              3.514        0.0007    Yes
Desert                      305              2.05         0.0408    No
Dense Vegetation             83              0.8341       0.4068    No
Mixed Water/Vegetation        0              N/A          N/A       N/A
Light Vegetation            740              1.2467       0.2129    No
Wet Soil                    205              5.261        0         Yes
Indeterminate Land Type    1292             14.0974       0         Yes


To  summarize  the  results  above,  the  satellite  from  which  the  data  was  gathered 
does  not  appear  to  impact  the  regression  significantly  in  the  cases  of  dry  soil,  desert, 
dense  vegetation,  light  vegetation,  and  moist  soil  land  types.  The  satellite  identity  does 
appear  to  have  a  significant  impact  upon  the  regression  coefficients  of  the  wet  soil, 
semidesert  and  indeterminate  land  types. 
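
The decision rule behind the “Null Rejected” column of Table 4 can be reproduced with a few lines of Python. The P-values below are copied from Table 4 (Mixed Water/Vegetation is omitted because it had no data); the variable and function names are assumptions made for the example.

```python
ALPHA = 0.01  # critical P-value stated with Table 4

# P-values as reported in Table 4
TABLE_4_P_VALUES = {
    "Dry, Arable Soil": 0.0196,
    "Moist Soil": 0.7125,
    "Semidesert": 0.0007,
    "Desert": 0.0408,
    "Dense Vegetation": 0.4068,
    "Light Vegetation": 0.2129,
    "Wet Soil": 0.0,
    "Indeterminate Land Type": 0.0,
}


def rejected_land_types(p_values, alpha=ALPHA):
    """Return the land types whose P-value falls below the rejection level."""
    return sorted(name for name, p in p_values.items() if p < alpha)
```

Applied to the Table 4 values, the rule flags the Indeterminate, Semidesert and Wet Soil categories, in agreement with the summary above.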


Discussion 

Note  the  extreme  disparity  between  the  t  statistics  for  the  calculated  land  types 
compared  to  indeterminate  land  types.  The  t  statistics  for  the  calculated  land  types  did 
not  exceed  5.261,  while  the  t  statistic  for  the  miscellaneous  category  was  a  rather  large 
14.0974.  From  this,  it  would  appear  the  correlation  between  satellite  identity  and 
regression  coefficients  is  far  less  pronounced  when  the  CV  algorithm  can  determine  a 
land type. The most plausible explanation is that the earth locations that registered as
“indeterminate” land types changed between the F10/F13 and F14 data sets. Because the
data  is  from  different  years,  it  is  possible  the  weather  was  significantly  different  at  some 
locations.  The  land  type  algorithms  are  especially  sensitive  to  heavy  rainfall  (McFarland, 
1991);  a  heavy  rain  event  one  year  but  not  the  other  could  send  a  large  number  of  data 
points  into  the  “indeterminate”  category. 

Significant  differences  between  the  F10/F13  and  F14  data  sets  might  also  explain 
the  rejection  of  the  Null  Hypothesis  for  the  Wet  Soil  and  Semidesert  land  types. 
Unfortunately,  due  to  data  constraints,  the  two  data  sets  come  from  subsequent  years  - 
and  not  even  exactly  the  same  time  of  year.  Differences  in  precipitation,  soil  moisture, 
and  even  evaporation  rates  can  cause  changes  in  surface  radiation  fluxes  (Betts  and  Ball, 
1998).  Therefore,  it  is  entirely  possible  that  the  statistical  differences  found  in  the  data 
sets  for  the  two  recalcitrant  land  types  were  due  to  factors  other  than  satellite  identity. 

The significant finding of this research is that, despite other possible significant
differences between data sets, the Null Hypothesis was NOT rejected for five land types. Moreover,
rejection of the Null Hypothesis for the Semidesert and Wet Soil land types merely
indicates there was a statistically significant difference between the two data sets; it was
not determined if this difference is due to satellite identity or other intrinsic differences in
the data sets.

Regression  Coefficient  Refinement 

For the five land types whose regression coefficients do not appear to be affected
by satellite identity, multiple linear regression was performed on all of the data from the
F10, F13 and F14 satellites. The regression equations are listed with Figures 11-15, and the
regression statistics are summarized in Table 5. The scatter plots in Figures 11-15 are
“45 degree” plots of the actual temperature versus the temperature value calculated by the
regression.

Figure 11: Scatter Plot for Dry Soil Land Type
Regression Equation: Tfit = 25.8771 + 0.2307 H19 - 0.4528 V19 + 0.4578 V22 + 0.4385 V37 - 0.4624 H85 + 0.7436 V85

Figure 12: Scatter Plot for Moist Soil Land Type
Regression Equation: Tfit = 18.9090 - 0.3197 V19 + 0.6129 V22 + 0.4926 V37 [sign illegible] 0.8541 H85 + 1.0402 V85

Figure 13: Scatter Plot for Desert Land Type
Regression Equation: Tfit = 32.5467 + 0.5766 V22 - 1.0011 V37 - 0.2832 H85 + 1.1725 V85

Figure 14: Scatter Plot for Dense Vegetation Land Type
Regression Equation: Tfit = (-14.5311) + 0.7351 V22 + 0.6965 H37 - 1.601 H85 + 0.8194 V85

Figure 15: Scatter Plot for Light Vegetation Land Type
Regression Equation: Tfit = 24.0912 - 0.2110 H19 + 0.2663 V19 + 0.5340 V22 + 0.4725 V37 - 0.3786 H85 + 0.2688 V85
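
As a minimal illustration of how one of the fitted equations would be applied, the sketch below evaluates the Dry Soil equation (Figure 11) for a single set of SSM/I brightness temperatures. Only the coefficients come from the text; the dictionary layout, the function name fitted_temperature, and the sample brightness temperatures are assumptions made for illustration.

```python
# Coefficients of the Dry Soil regression equation (Figure 11).
DRY_SOIL_INTERCEPT = 25.8771
DRY_SOIL_COEFFS = {
    "H19": 0.2307, "V19": -0.4528, "V22": 0.4578,
    "V37": 0.4385, "H85": -0.4624, "V85": 0.7436,
}


def fitted_temperature(tb, intercept=DRY_SOIL_INTERCEPT, coeffs=DRY_SOIL_COEFFS):
    """Return the fitted surface temperature (K) for one observation.

    tb maps channel names (e.g. "V19") to brightness temperatures in K.
    """
    return intercept + sum(c * tb[ch] for ch, c in coeffs.items())


# Hypothetical observation: every channel reads 280 K (not real data).
sample = {ch: 280.0 for ch in DRY_SOIL_COEFFS}
t_fit = fitted_temperature(sample)
print(round(t_fit, 2))
```

For this hypothetical input the sketch gives roughly 293.4 K. The other land types would use the same function with their own intercept and coefficient dictionary.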


Table 5. Combined Regression Results (All Satellites, F10/F13/F14)
Null Hypothesis: All regression coefficients = 0

Land Type          # of Data Sets   F Statistic   P Value   Null Rejected   RMSE (K)   R squared
Dry, Arable Soil   681              590.4         0         Yes             3.301      0.8402
Moist Soil         346              293.1         0         Yes             3.089      0.8117
Desert             305              161.3         0         Yes             3.743      0.7295
Dense Vegetation   83               121.1         0         Yes             3.053      0.8613
Light Vegetation   740              557.4         0         Yes             2.825      0.8202
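
The columns of Table 5 are related by standard regression identities. The sketch below is a plain-Python illustration, not the software used in this research. It computes RMSE and R squared from observed and fitted temperatures, and recovers the overall F statistic from R squared, the number of data sets n, and the number of predictors p. The function names are hypothetical. The RMSE here divides by n - p - 1 (the usual residual standard error), which is an assumption about the divisor, and the predictor counts are read off the printed regression equations.

```python
import math


def rmse(observed, fitted, n_predictors):
    """Residual standard error: sqrt(SSE / (n - p - 1))."""
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    return math.sqrt(sse / (len(observed) - n_predictors - 1))


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SSE / SSTO."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ssto = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / ssto


def f_statistic(r2, n, n_predictors):
    """Overall F statistic for the null hypothesis that all slopes are 0."""
    return (r2 / n_predictors) / ((1.0 - r2) / (n - n_predictors - 1))


# Moist Soil row of Table 5: n = 346, R squared = 0.8117, 5 predictors.
f_moist = f_statistic(0.8117, 346, 5)
# Dense Vegetation row of Table 5: n = 83, R squared = 0.8613, 4 predictors.
f_dense = f_statistic(0.8613, 83, 4)
print(round(f_moist, 1), round(f_dense, 1))
```

Both values round to the F statistics listed in Table 5 for those land types (293.1 and 121.1), which is a useful consistency check on the reconstructed table.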

The R squared values are quite high, ranging from 0.7295 to 0.8613. Somewhat surprising
is that the highest R squared value comes from the Dense Vegetation land type, the
category with the smallest sample size (83 data sets). The following table compares the
RMSE values calculated in this research with the RMSE values in McFarland’s original research.

Table 6: RMSE (K) Comparison to Original CV Regression

Land Type          McFarland RMSE   Adair RMSE
Dry Soil           3.60*            3.301
Moist Soil         2.78**           3.089
Desert             3.60*            3.743
Dense Vegetation   3.45***          3.053
Light Vegetation   2.69             2.825

* Combined regression for Dry Soil, Semidesert and Desert Land Types
** Combined regression for Moist Soil and Wet Soil Land Types
*** Combined regression for Dense Vegetation and Mixed Water/Vegetation Land Types


Cross-validation  of  the  data  sets 

It is important to note the above regressions are for qualitative judgement only.
The overall data sets were not split into training and validation sets; rather, the failure
to reject the Null Hypothesis in the indicator variable regression (see Table 4) was used as
evidence that the F10/F13 data and the F14 data for the five land types had identical true
regression equations. However, to justify further the value of CV coefficient refinement,
the data was split and cross-validated. First, F10/F13 data from each land type was
regressed; then, the coefficients were applied to the respective F14 data and MSPR values
were calculated. Next, the reverse was done: F14 data was regressed and then the
coefficients were used on the F10/F13 data for calculation of MSPR. Finally, the MSPR
values were compared with the RMSE values found in Harris’s research. Following are
the scatter plots of the cross-validations, followed by two tables summarizing the results.
Each scatter plot shows observed temperature versus the fitted value calculated with the
other data set's coefficients.

[Scatter plot: F14 Data, Desert; fitted values using F10/F13 coefficients]

Figure 19: Cross-validation of F10/F13 Coefficients on F14 Data - Light Vegetation
Tfit = 10.6856 - 0.584 H19 + 0.7251 V19 + 0.5375 V22 + 0.869 V37 - 0.5523 H85

Figure 20: Cross-validation of F10/F13 Coefficients on F14 Data - Semidesert (FAILED)

Figure 21: Cross-validation of F10/F13 Coefficients on F14 Data - Wet Soil (FAILED)

[Scatter plot: F10/F13 Data, Dry Soil; fitted values using F14 coefficients]

Figure 23: Cross-validation of F14 Coefficients Using F10/F13 Data - Moist Soil
Tfit = 33.8725 + 0.5246 H37 - 0.9232 H85 + 1.3298 V85

Figure 24: Cross-validation of F14 Coefficients Using F10/F13 Data - Desert
Tfit = 57.0802 + 0.6646 V22 + 0.1859 V85

Figure 25: Cross-validation of F14 Coefficients Using F10/F13 Data - Dense Vegetation
Tfit = 8.2529 + 0.4995 H19 - 1.1686 H85 + 1.6808 V85

[Scatter plot: F10/F13 Data, Light Vegetation; fitted values using F14 coefficients]

Figure 27: Cross-validation of F14 Coefficients Using F10/F13 Data - Semidesert (FAILED)

Figure 28: Cross-validation of F14 Coefficients Using F10/F13 Data - Wet Soil (FAILED)


Table 7: RMSE and MSPR Values of F10/F13 Coefficients Cross-validated on F14 Data

Land Type          RMSE (K)   RMSPR (K)   R-Squared   Indicator Variable Test
Dry Soil           3.686      3.297       0.7499      Pass
Moist Soil         3.193      2.877       0.8033      Pass
Wet Soil           4.284      4.409       0.6254      Fail
Semidesert         3.326      5.221       0.7945      Fail
Desert             3.619      4.773       0.6791      Pass
Light Vegetation   3.128      2.687       0.6861      Pass

Table 8: RMSE and MSPR Values of F14 Coefficients Cross-validated on F10/F13 Data

Land Type          RMSE (K)   RMSPR (K)   R-Squared   Indicator Variable Test
Dry Soil           2.957      3.853       0.785       Pass
Moist Soil         2.534      3.708       0.8188      Pass
Wet Soil           3.547      5.174       0.665       Fail
Semidesert         2.993      7.072       0.516       Fail
Desert             2.891      4.746       0.6979      Pass
Light Vegetation   2.456      3.337       0.8227      Pass
Dense Vegetation   2.945      3.896       0.8062      Pass

The results of the cross-validation were quite encouraging. Even the RMSPR values of
the land types that failed the indicator variable test (4.4 K to 7.1 K) were comparable
with the RMSE values from Harris (Table 9). The RMSPR values for the land types that
passed the indicator variable test (at most 4.8 K) were lower than the RMSE values found
by Harris.

Table 9: RMSE (K) of CV Algorithm from Harris (1998)
(For comparison: Highest Validated RMSPR From Research: 4.8 K)

Region         Summer   Fall   Winter
CONUS          7.8      5.6    6.2
Bosnia         19.4     5.3    5.2
Korea          8        6.2    6.2
Saudi Arabia   7.4      7.4    7.4

From  the  above  results,  it  is  reasonable  to  conclude  that  a  comprehensive  linear 
regression  to  refine  brightness  temperature  coefficients  would  yield  more  accurate 
results. 
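
The swap cross-validation summarized in Tables 7 and 8 (regress one data set, apply its coefficients to the other, and compute the prediction error) can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not the software used in this research. The helper names are hypothetical, the sample numbers are invented, and RMSPR is taken to be the square root of the mean squared prediction error over the validation set.

```python
import math


def fit_ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(c + 1, k):
            f = A[i][c] / A[c][c]
            for j in range(c, k + 1):
                A[i][j] -= f * A[c][j]
    # Back substitution.
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (A[i][k] - tail) / A[i][i]
    return beta


def predict(beta, X):
    """Apply fitted coefficients to each row of predictors."""
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in X]


def rmspr(beta, X, y):
    """Root mean squared prediction error on a validation set."""
    preds = predict(beta, X)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y, preds)) / len(y))


def swap_validate(XA, yA, XB, yB):
    """Fit on A and score on B, then fit on B and score on A."""
    return rmspr(fit_ols(XA, yA), XB, yB), rmspr(fit_ols(XB, yB), XA, yA)


# Hypothetical stand-ins for the two data sets: one brightness temperature (K)
# per row, paired with an observed surface temperature (K).
x_f10f13 = [[270.0], [275.0], [280.0], [285.0]]
y_f10f13 = [281.0, 285.5, 291.0, 295.0]
x_f14 = [[272.0], [278.0], [283.0], [288.0]]
y_f14 = [283.5, 288.0, 293.5, 298.0]

rmspr_on_f14, rmspr_on_f10f13 = swap_validate(x_f10f13, y_f10f13, x_f14, y_f14)
print(rmspr_on_f14, rmspr_on_f10f13)
```

An RMSPR close to the RMSE of the fitting set suggests the coefficients carry over to the other data set; a much larger RMSPR, as for the Semidesert rows of Tables 7 and 8, suggests they do not.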

Perhaps  most  perplexing  were  the  differing  results  by  land  type  between  the  two 
data  sets.  If  the  F10/F13  and  F14  data  sets  were  insufficiently  similar,  it  would  follow 
that  data  for  all  land  types  would  have  shown  statistically  significant  differences. 
Conversely,  if  the  data  sets  were  sufficiently  similar,  then  all  of  the  land  types  should 
have  indicated  so.  This  researcher  speculates  that  the  Wet  Soil  and  Semidesert  land 
types  either  (1)  are  more  sensitive  to  changing  weather  between  years  than  other  land 
types,  (2)  are  situated  in  locations  prone  to  more  radical  weather  changes  than  locations 
of  other  land  types,  or  (3)  have  emissivity  characteristics  which  are  more  location 
dependent than those for other land types.

Summary

Qualitative  regression  indicated  satellite  identity  might  not  be  a  significant  factor 
in  determining  the  regression  coefficients  for  five  of  eight  CV  land  types.  The  same 
regression  indicated  satellite  identity  might  be  a  significant  factor  for  two  land  types,  as 
well  as  for  data  of  an  indeterminate  land  type.  The  “Mixed  Water  and  Vegetation”  land 
type  had  no  data  ascribed  to  it,  and  so  the  regression  for  that  land  type  could  not  be 
performed.  Multiple  linear  regression  was  performed  upon  the  five  satellite-independent 
land  types.  Results  indicated  reasonable  RMSEs  and  high  R  squared  values.  Cross- 
validation  yielded  MSPRs  that  compared  favorably  with  RMSEs  found  from  Harris’s  use 
of  the  original  coefficients.  Therefore,  it  appears  CV  could  be  made  more  accurate  with 
revised coefficients.

V. Summary, Conclusions and Recommendations


Summary 

Using  the  statistical  methods  of  multiple  linear  regression,  stepwise  linear 
regression,  and  qualitative  regression,  3,700  data  sets  from  Fall  of  1996  and  Fall  of  1997, 
including  microwave  brightness  temperatures  from  three  satellites,  were  analyzed  to 
determine  if  satellite  identity  had  a  significant  impact  on  CV  regression  coefficients. 
Analysis  indicated  that  satellite  identity  does  not  appear  to  have  a  significant  impact  on 
the  regression  coefficients  for  five  of  the  eight  CV  land  types  investigated.  Analysis  of 
two  CV  land  types  indicated  satellite  identity  might  have  a  significant  impact,  while  there 
was  insufficient  data  to  determine  the  impact  for  one  CV  land  type. 

In  addition  to  the  qualitative  regression,  stepwise  linear  regression  was  performed 
on  five  land  type  categories  using  combined  data  from  all  satellites.  Regressed  RMSEs 
ranged  from  2.825  K  to  3.743  K,  while  R  squared  values  ranged  from  .7295  to  .8613. 
Preliminary  analysis  indicated  refinement  of  CV  brightness  temperature  coefficients  may 
yield  better  accuracy  for  the  algorithm. 

Finally,  the  data  was  cross-validated  by  splitting  the  data,  regressing  each  set,  and 
calculating  the  MSPRs  when  the  regressed  coefficients  were  used  on  the  other  data  set. 
The  results  of  the  cross-validation  confirmed  the  results  of  the  indicator  variable 
regression.

Conclusion and Recommendation for AFWA

The  assumption  that  CV  coefficients  need  not  be  recalculated  for  each  satellite 
appears  sound  for  most  of  the  data.  However,  satellite  identity  might  have  a  significant 
impact  upon  the  regression  for  at  least  two  of  the  land  types.  Therefore,  AFWA  should 
consider  further  testing  the  algorithms  for  the  semidesert  and  wet  soil  land  types  to  see  if 
there is indeed satellite dependence for these two land types. Because of the unique
properties of “indeterminate” land type data as discussed in Chapter 4, AFWA should
investigate regressing location- and rainfall-dependent coefficients for data for which the CV
algorithm cannot determine a land type.

Recommendations  for  Further  Research 

The  first  research  project  I  recommend  would  be  to  investigate  the  characteristics 
of  the  Semidesert  and  Wet  Soil  land  types.  While  it  is  possible  these  two  land  types  are 
satellite  dependent  for  some  reason,  it  is  more  likely  that  the  data  needs  to  be  split  into 
other  categories.  For  example,  it  is  possible  the  data  needs  to  be  split  depending  on  the 
seasonal  amount  of  rainfall  received.  Perhaps  the  differences  would  be  ameliorated  if  a 
larger  data  set  were  used  to  include  more  varied  cases.  Finally,  the  land  types  could  be 
split  by  the  amount  of  rainfall  received  in  a  given  period  of  time  -  for  example,  the 
previous  12  hours. 

The first part of the above project, determination of satellite dependence, would
be relatively straightforward. A large number of data sets from identical areas of known
semidesert and wet soil land types (that is, land types known beforehand rather than
determined by the CV algorithm) and near-identical times could be calculated using
different DMSP satellites. Qualitative regression could then be performed as was done in
this research. If the statistics still showed a significant difference between the data sets,
that would indicate sensor dependence.

Assuming  no  sensor  dependence,  the  second  part  of  the  project  would  be 
considerably  more  involved.  SSM/I  brightness  temperatures  would  need  to  be  correlated 
not  just  with  observed  temperature,  but  also  with  current  precipitation  intensity, 
precipitation  amounts  for  a  given  time  in  the  past,  and  climatological  precipitation  data 
by  location  and  season.  The  current  precipitation  intensity  could  be  calculated  using 
SSM/I rain rate algorithms (e.g., Ferraro et al., 1998), but determining and correlating the
latter two parameters would be a time-intensive process hampered by sparse data.

Another  avenue  of  follow-on  research  would  be  to  modify  the  CV  algorithm’s 
coefficients  and  improve  its  accuracy.  All  available  DMSP  data  should  be  combined 
with  synoptic  observations  to  create  an  enormous  database.  FORTRAN  programs  could 
then  be  written  to  match  data,  separate  data  into  land  types,  and  separate  data  by  satellite 
identity.  Data  could  be  combined  and  regressed  for  five  of  the  land  types,  while  separate 
regressions  by  either  satellite,  precipitation  intensity  or  location  could  be  accomplished 
for  the  semidesert  and  wet  soil  land  types.  Finally,  a  single  FORTRAN  program  could  be 
constructed  to  accomplish  all  of  the  above  tasks  and  incorporate  the  calculated 
coefficients. 

Another possible use of indicator variables is to determine if the location of
observations (e.g., CONUS versus Middle East) has an impact on the regression
coefficients (see Ferraro et al., 1998). This research did not divide the data by location
because there was insufficient data in some locations and/or land types to perform a
comprehensive study. Such a study could perform qualitative regression with an
indicator variable distinguishing two different regions.

Another  avenue  for  further  research  would  be  to  examine  the  effects  of 
precipitation  upon  the  regression  coefficients.  An  indicator  variable  could  be  constructed 
to  represent  precipitation  intensity  and  then  regressed,  using  a  multiple  value  indicator 
variable  instead  of  a  Bernoulli  indicator  variable  (see  Chapter  10  of  Neter  et  al.,  1983). 
Depending  upon  the  results,  an  SSM/I  rain-rate  retrieval  algorithm  (see  Conner  and  Petty, 
1998)  could  be  incorporated  into  CV  to  refine  temperature  calculation  further. 
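
A qualitative predictor with c classes is conventionally represented by c - 1 indicator (0/1) variables. The sketch below shows one way a precipitation-intensity category could be encoded for such a regression. The three intensity classes, the helper names, and the choice of "none" as the reference class are assumptions for illustration only; this research did not define them.

```python
# Hypothetical precipitation-intensity classes; "none" is the reference class.
CLASSES = ("none", "light", "heavy")


def encode_intensity(intensity, classes=CLASSES):
    """Encode one class label as c - 1 indicator variables (0 or 1).

    The first class is the reference and maps to all zeros.
    """
    if intensity not in classes:
        raise ValueError(f"unknown intensity class: {intensity!r}")
    return [1 if intensity == c else 0 for c in classes[1:]]


def design_row(brightness_temps, intensity):
    """Append the indicator variables to one row of brightness temperatures."""
    return list(brightness_temps) + encode_intensity(intensity)


# Hypothetical observation with two brightness temperatures (K).
row = design_row([265.0, 270.5], "heavy")
print(row)
```

Rows built this way could be passed to any least-squares routine; the fitted coefficient on each indicator would then estimate the temperature offset of that intensity class relative to the reference class.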

This  research  did  not  address  the  possibility  of  errors  introduced  by  the  surface 
type  algorithms  themselves.  A  possible  topic  of  future  research  would  be  to  examine  the 
locations of the data matches sorted into various land types. A map outlining locations
and categories of various land type matches could be generated to see if such land types
actually exist at those locations. For example, if there were a number of “moist soil” land type hits in the middle of
the  Sahara  Desert,  or  a  number  of  “desert”  hits  in  the  Amazon,  it  would  indicate  the  land 
type  sorting  algorithms  are  in  error. 

Finally,  research  could  be  performed  to  determine  if  a  polynomial  regression 
would  fit  the  data  better.  However,  while  the  regression  indicated  slight  deviation  from 
normality  at  extremely  high  and  low  temperatures  (see  Appendix  B),  the  high  R  squared 
values suggest research time might be spent first on the research topics outlined above.

Appendix A: Other Regression Performed but Not Directly Used in the Research

Figure A1: Scatter Plot for Dry Soil Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = 8.0986 - 0.1447 H19 - 0.2169 V19 + 1.0707 V22 + 0.2780 H37 - 0.2762 H85 + 0.3041 V85

Figure A2: Scatter Plot for Moist Soil Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = 21.1260 - 0.3748 H19 + 0.3105 V19 + 0.5380 V22 - 0.4812 H85 + 0.9579 V85

Figure A3: Scatter Plot for Semidesert Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = 88.4597 - 0.6082 H19 + 0.5872 V19 + 1.1129 V22 + 0.3985 H37 - 1.1530 V37 + 0.3645 V85

Figure A4: Scatter Plot for Desert Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = 94.7734 + 0.4267 V19 + 0.3087 V22 + 0.2945 H37 - 0.7946 V37 + 0.4776 V85

Figure A5: Scatter Plot for Light Vegetation Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = (-44.3633) + 0.5181 H19 + 0.7982 V22 + 0.8012 H37 [sign illegible] 0.8961 H85

Figure A6: Scatter Plot for Wet Soil Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = 137.6667 - 0.2642 H19 + 0.3931 V19 + 0.3170 V22 [sign illegible] 0.3191 V37 - 0.1080 H85 + 0.5019 V85

Figure A7: Scatter Plot for Indeterminate Land Type, Jan-Feb 1997, F10/F13
Regression Equation: Tfit = 161.3059 - 0.2227 V19 + 0.4300 V22 - 0.1430 H37 + 0.0614 V37 - 0.1225 H85 + 0.4366 V85


Table A1: Regression Results (F10/F13) for Jan-Feb 1997
Null Hypothesis: All regression coefficients = 0

Land Type          # of Data Sets   F Statistic   P Value   Null Rejected   RMSE (K)   R squared
Dry, Arable Soil   579              247.8         0         Yes             3.413      0.7222
Moist Soil         283              141.8         0         Yes             2.981      0.719
Semidesert         184              88.3          0         Yes             4.214      0.7496
Desert             259              88.8          0         Yes             2.553      0.6369
Dense Vegetation   0                N/A           N/A       N/A             N/A        N/A
Mixed Water/Veg    0                N/A           N/A       N/A             N/A        N/A
Light Vegetation   112              89.1          0         Yes             2.773      0.7691
Wet Soil           758              208.8         0         Yes             3.831      0.6252
Indeterminate      1881             725.3         0         Yes             4.581      0.699

Appendix B: Residual and Normality Plots of the F10/F13/F14 Regression

Figure B1: Residual Plot for Dry Soil Land Type

Figure B2: Normality Plot for Dry Soil Land Type

Figure B5: Residual Plot for Desert Land Type (fitted on V22 + H37 + V37 + H85 + V85)

Figure B6: Normality Plot for Desert Land Type


Appendix C: CV Land Type Algorithms

If (V22 - V19) < 4.0 and
4.0 < (V19 + V37)/2 < 9.8 and
(V37 - V19) > -6.5 and
-5.0 < (V85 - V37) < 0.5 and
(H85 - H37) < 4.2 then
land type = dry, arable soil

If (V22 - V19) < 4.0 and
4.0 < (V19 + V37)/2 < 19.7 and
(V37 - V19) > -6.5 and
0.5 < (V85 - V37) < 4.0 and
(H85 - H37) < 4.2 then
land type = moist soil

If (V22 - V19) < 4.0 and
9.8 < (V19 + V37)/2 < 19.7 and
(V37 - V19) > 6.5 and
(V85 - V37) < 0.5 and
(H37 - H19) < -1.8 and
(H85 - H37) < 6.0 then
land type = semidesert

If (V22 - V19) < 2.0 and
(V19 + V37)/2 > 19.7 and
V19 > 268 and
(H85 - H37) > -1.0 then
land type = desert

If (V22 - V19) < 4.0 and
(V19 + V37)/2 > 6.4 and
(V37 - V19) > -6.5 and
(V85 - V37) < 0.5 and
(H85 - H37) > 4.2 then
land type = wet soil

If (V22 - V19) < 4.0 and
(V19 + V37)/2 < 1.9 and
(V85 - V37) > -1.0 and
(H85 - H37) < 4.5 and
V19 > 262.0 then
land type = dense vegetation

If (V22 - V19) < 4.0 and
(V19 + V37)/2 < 6.4 and
(V85 - V37) > -1.0 and
(H85 - H37) > 4.5 and
(V37 - H37) > 257.0 then
land type = mixed water and vegetation

If (V22 - V19) < 4.0 and
1.9 < (V19 + V37)/2 < 4.0 and
(V85 - V37) > -1.0 and
(H85 - H37) < 4.5 and
V19 > 262.0 then
land type = less dense vegetation